A Comparative Study of Imputation Techniques for Missing Data in Collaborative Filtering Using Apache Mahout
Abstract
Recommender systems are a powerful tool for improving the user experience in a wide range of applications. However, missing data in the user-item rating matrix is a common problem that degrades the performance of these systems. To address it, imputation techniques are used to estimate the missing values in the matrix. Apache Mahout is a popular open-source library that provides various algorithms for building recommender systems, along with implementations of several imputation techniques for handling missing data in the user-item rating matrix. This paper aims to improve the accuracy and performance of user-based collaborative filtering (UB-CF) by applying imputation with Apache Mahout. The experiments are carried out on the real-world MovieLens data set. The results show that the proposed method is effective in identifying and handling missing and noisy data in the user-item rating matrix, and that it yields a considerable improvement over previous approaches.