Bridging Data Science and Big Data Analytics: Mathematical Foundations for Innovation and Scalable Efficiency
DOI:
https://doi.org/10.58916/jhas.v10i2.719
Keywords:
Big data analysis, Data analysis, Optimization, Integration
Abstract
Big data analytics and data science have revolutionized our ability to extract actionable insights from massive datasets through advanced mathematical methodologies. This article focuses on three mathematical foundations, namely statistical inference, optimization, and linear algebra, and addresses the challenges of ensuring data security and seamless integration despite the complexity of large-scale datasets. Drawing on empirical evidence from diverse global contexts, the discussion highlights strategies to enhance data-driven decision-making while exploring the ethical dilemmas and privacy concerns associated with big data. Through practical examples and grounded analysis, this work aims to bridge the gap between theoretical understanding and real-world application in data science and analytics.