Research on Several Key Techniques in User-Based Collaborative Filtering Recommendation Algorithms

Published: 2018-06-18 03:31

Topic: recommender systems + collaborative filtering; source: master's thesis, Anhui University of Technology, 2017

【Abstract】: The rapid development of the internet and e-commerce has enriched people's lives, but it has also brought the problem of information overload. Recommender systems are one technology for addressing this problem: they provide users with accurate, intelligent, and personalized recommendations. The two key steps in recommendation are determining the number K of nearest neighbors for a user and predicting the user's ratings of items.

First, K, the number of similar users, is generally chosen from experience or through repeated experiments. Existing methods are therefore subjective and tedious, which hurts the accuracy of the recommendation algorithm. Second, rating prediction has two weaknesses. When there are multiple neighbors and a classical similarity measure such as cosine or Pearson is used, most similarity values between users come out as 1, so the traditional prediction method returns results that mostly approximate the user's mean rating. When only one neighbor has rated the target item, the similarity between users contributes nothing to the final predicted rating, and the prediction is simply the target user's mean rating, which discriminates poorly between user preferences.

To address these problems, this thesis analyzes nearest-neighbor selection and rating prediction within user-based collaborative filtering, builds a K-nearest-neighbor optimization model, and proposes an improved rating prediction method. The main contributions are as follows:

(1) A nearest-neighbor optimization method based on the differential evolution algorithm. The method combines users' actual ratings with predicted ratings to build an optimization model whose objective function is to minimize the mean absolute error (MAE), and then computes the optimal result with differential evolution. Three metrics, MAE, accuracy, and recall, verify the superiority of the new method. It removes the limitation of traditional nearest-neighbor selection, which relies on a manually set similarity threshold, and quickly finds the optimal K through differential evolution.

(2) An improved prediction method based on the Slope One algorithm. Building on the traditional rating prediction method, it borrows the idea of Slope One: it takes full account of the items co-rated by the current user and the nearest-neighbor users, and it incorporates similarity to reflect how much each neighbor contributes to predicting the current user's ratings. The new method effectively resolves three problems of traditional rating prediction: poor discrimination between user preferences, incomplete use of the available rating information, and equal treatment of all nearest neighbors.

The thesis distinguishes between cold-start and non-cold-start scenarios and evaluates the two proposed methods on three classic datasets: MovieLens, Epinions, and Netflix. The new methods show clear advantages over traditional prediction methods in MAE, accuracy, and recall, and significantly improve the accuracy and recommendation quality of user-based collaborative filtering. Both methods apply to cold-start and non-cold-start settings, integrate well with existing recommender systems, and have high practical value.
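The two ideas in the abstract, the traditional neighbor-based prediction and the choice of K by minimizing MAE, can be illustrated with a small sketch. This is not the thesis's implementation. It assumes the common mean-centered, similarity-weighted prediction formula, which may differ from the exact formula used in the thesis. It also uses hypothetical toy ratings, hypothetical function names (`predict`, `mae_for_k`, `best_k`), and a plain exhaustive search over K standing in for differential evolution.

```python
import math

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
RATINGS = {
    "u1": {"a": 5, "b": 3, "c": 4, "d": 4},
    "u2": {"a": 4, "b": 2, "c": 3, "d": 3},
    "u3": {"a": 1, "b": 5, "c": 2, "d": 1},
    "u4": {"a": 5, "b": 2, "c": 5, "d": 4},
    "u5": {"a": 2, "b": 4, "c": 1, "d": 2},
}


def user_mean(ratings, u, skip=None):
    """Mean rating of user u, optionally hiding one item (leave-one-out)."""
    vals = [r for i, r in ratings[u].items() if i != skip]
    return sum(vals) / len(vals)


def pearson(ratings, u, v, skip=None):
    """Pearson correlation over co-rated items, optionally hiding one item."""
    common = [i for i in ratings[u] if i in ratings[v] and i != skip]
    if len(common) < 2:
        return 0.0
    mu = sum(ratings[u][i] for i in common) / len(common)
    mv = sum(ratings[v][i] for i in common) / len(common)
    num = sum((ratings[u][i] - mu) * (ratings[v][i] - mv) for i in common)
    den = math.sqrt(
        sum((ratings[u][i] - mu) ** 2 for i in common)
        * sum((ratings[v][i] - mv) ** 2 for i in common)
    )
    return num / den if den else 0.0


def predict(target_mean, neighbors):
    """Mean-centered weighted prediction.

    neighbors: list of (similarity, neighbor_rating, neighbor_mean).
    Falls back to the target user's mean when no neighbor carries weight.
    """
    den = sum(abs(s) for s, _, _ in neighbors)
    if den == 0:
        return target_mean
    num = sum(s * (r - m) for s, r, m in neighbors)
    return target_mean + num / den


def predict_rating(ratings, u, item, k):
    """Predict u's rating of item from the k most similar users who rated it."""
    top = sorted(
        ((pearson(ratings, u, v, skip=item), v)
         for v in ratings if v != u and item in ratings[v]),
        reverse=True,
    )[:k]
    neighbors = [(s, ratings[v][item], user_mean(ratings, v)) for s, v in top]
    return predict(user_mean(ratings, u, skip=item), neighbors)


def mae_for_k(ratings, k):
    """Leave-one-out mean absolute error for a given neighborhood size k."""
    errors = [
        abs(predict_rating(ratings, u, i, k) - r)
        for u, items in ratings.items()
        for i, r in items.items()
    ]
    return sum(errors) / len(errors)


def best_k(ratings, k_max):
    """Exhaustive search for the K with the lowest MAE (stand-in for DE)."""
    return min(range(1, k_max + 1), key=lambda k: mae_for_k(ratings, k))


if __name__ == "__main__":
    # With a single neighbor, the similarity weight cancels out of the formula.
    print(predict(3.5, [(0.2, 5.0, 4.0)]), predict(3.5, [(0.9, 5.0, 4.0)]))
    print("best K on the toy data:", best_k(RATINGS, 4))
```

In this formulation a single neighbor's similarity appears in both the numerator and the denominator, so a weakly similar neighbor and a strongly similar one produce the same prediction. That is one way to see why the abstract says similarity contributes nothing in that case. The exhaustive search is only practical because K is a small integer here. An evolutionary optimizer such as the one the thesis uses would replace `best_k` while keeping `mae_for_k` as the objective.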
【Degree-granting institution】: Anhui University of Technology
【Degree level】: Master's
【Year degree granted】: 2017
【Classification number】: TP391.3

【References】