A comparative of Imputation Techniques for Missing Data in Collaborative Filtering Using Apache Mahout

Authors

  • Morad Ali Hassan, Faculty of Science – Bani Waleed University (Author)
  • Mohamed Abdo ulwahad Alsharaa, Faculty of Education – Bani Waleed University (Author)

DOI:

https://doi.org/10.58916/jhas.v8i1.502

Keywords:

Recommender systems, collaborative filtering, Apache Mahout, imputation techniques, Big Data, missing and noisy data, prediction.

Abstract

Recommender systems are a powerful tool for improving the user experience in a variety of applications. However, missing data in the user-item rating matrix is a common problem that degrades the performance of these systems. To address it, imputation techniques are used to estimate the missing values in the matrix. Apache Mahout is a popular open-source library that provides various algorithms for building recommender systems. It also provides implementations of several imputation techniques for handling missing data in the user-item rating matrix. This paper aims to improve the accuracy and performance of user-based collaborative filtering (UB-CF) by applying an imputation technique with Apache Mahout. The experiments are carried out on the real-world MovieLens data set. The results show that the proposed method is effective in identifying and handling missing and noisy data in the user-item rating matrix, and that it yields a considerable improvement over previous approaches.
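To make the idea of imputing a sparse rating matrix before user-based collaborative filtering concrete, the following is a minimal sketch in plain Python. It is not the authors' method and it does not use the Apache Mahout API. It assumes user-mean imputation as a simple stand-in technique, Pearson correlation computed on the imputed matrix, and a mean-centered weighted prediction over neighbours who actually rated the target item. The toy ratings and all function names are hypothetical.

```python
from math import sqrt

def user_means(ratings):
    # Mean of each user's observed ratings (assumes every user rated at least one item).
    return {u: sum(r.values()) / len(r) for u, r in ratings.items()}

def impute_user_mean(ratings, items, means):
    # Assumed stand-in imputation: fill every unrated item with the user's own mean.
    return {u: {i: r.get(i, means[u]) for i in items} for u, r in ratings.items()}

def pearson(row_a, row_b, mean_a, mean_b, items):
    # Pearson correlation over the dense (imputed) rating vectors.
    num = sum((row_a[i] - mean_a) * (row_b[i] - mean_b) for i in items)
    den_a = sqrt(sum((row_a[i] - mean_a) ** 2 for i in items))
    den_b = sqrt(sum((row_b[i] - mean_b) ** 2 for i in items))
    den = den_a * den_b
    return num / den if den else 0.0

def predict(user, item, ratings, filled, means, items):
    # Mean-centered weighted average over neighbours that really rated `item`;
    # similarities come from the imputed matrix, deviations from observed ratings.
    num = den = 0.0
    for other, row in ratings.items():
        if other == user or item not in row:
            continue
        sim = pearson(filled[user], filled[other], means[user], means[other], items)
        num += sim * (row[item] - means[other])
        den += abs(sim)
    if den == 0:
        return means[user]  # no usable neighbour: fall back to the user's mean
    return means[user] + num / den

# Hypothetical toy data: three users, four items, some ratings missing.
ratings = {
    "u1": {"A": 5, "B": 3, "C": 4},
    "u2": {"A": 4, "B": 2, "D": 5},
    "u3": {"A": 1, "B": 5, "D": 2},
}
items = ["A", "B", "C", "D"]
means = user_means(ratings)
filled = impute_user_mean(ratings, items, means)
prediction = predict("u1", "D", ratings, filled, means, items)
print(round(prediction, 3))
```

With mean imputation the filled entries contribute zero deviation, so they widen the overlap between users without shifting the correlation toward any particular rating. More elaborate imputers would replace only `impute_user_mean` in this sketch.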

Published

2023-03-04

Section

Article

How to Cite

Morad Ali Hassan, & Mohamed Abdo ulwahad Alsharaa. (2023). A comparative of Imputation Techniques for Missing Data in Collaborative Filtering Using Apache Mahout. Bani Waleed University Journal of Humanities and Applied Sciences, 8(1), 467-497. https://doi.org/10.58916/jhas.v8i1.502
