A Collaborative Filtering Recommendation Algorithm Based on User Ratings and a Genetic Algorithm
Published: 2018-12-17 06:14
[Abstract]: With the rapid development of the Internet, people's lives have changed dramatically, yet finding the information one actually needs in such a vast volume of data has become increasingly difficult. Recommender systems emerged in response, and they play an increasingly important role in mitigating the negative effects of information overload on many websites, where users often express their preferences for items or services through ratings. Collaborative filtering is currently one of the most widely used recommendation techniques. It analyzes user interests, finds the users in the community whose interests are similar to those of a target user, and combines those similar users' evaluations of an item to predict how much the target user will like it. Commonly used similarity measures include cosine similarity and the Pearson correlation coefficient, but their formulas are relatively complex, so similarity computation takes too much time during recommendation and lowers recommendation efficiency. This thesis proposes a new similarity measure based on a genetic algorithm and user rating information. First, a vector p_{x,y} with C-c+1 elements is defined (for example, with C=5 and c=1 the vector has 5 elements). Each element p_{x,y}(i) is the ratio of a, the number of times the rating difference between two users x and y on the same item equals i, to b, the number of items rated by both users. Second, a weight vector q with C-c+1 elements is defined. Each element q(i) takes a value in [-1, 1] and measures how important p_{x,y}(i) is when computing the similarity between the two users. Together these two vectors form the new similarity measure, and the optimal weight vector is obtained with a genetic algorithm.
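The two vectors described above can be illustrated with a minimal Python sketch. Several details are assumptions, since the abstract does not state them: the rating difference is taken as an absolute difference, C and c are treated as the maximum and minimum rating values, and the two vectors are combined as a plain weighted sum. The function names and the sample ratings are hypothetical.

```python
from typing import Dict, List


def difference_proportions(ratings_x: Dict[str, int],
                           ratings_y: Dict[str, int],
                           c_min: int = 1,
                           c_max: int = 5) -> List[float]:
    """Build p_{x,y}: element i is a / b, where a counts co-rated items whose
    rating difference equals i and b is the number of co-rated items.
    Assumption: the difference is the absolute difference, so i = 0..C-c."""
    size = c_max - c_min + 1
    common = ratings_x.keys() & ratings_y.keys()
    if not common:
        # No co-rated items: no evidence, so every proportion is zero.
        return [0.0] * size
    counts = [0] * size
    for item in common:
        counts[abs(ratings_x[item] - ratings_y[item])] += 1
    return [a / len(common) for a in counts]


def weighted_similarity(p: List[float], q: List[float]) -> float:
    """Assumed combination of the two vectors: the sum of q(i) * p(i)."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of elements")
    return sum(q_i * p_i for q_i, p_i in zip(q, p))


# Hypothetical ratings on a 1..5 scale; the co-rated items are a, b and c,
# with rating differences 0, 2 and 1 respectively.
user_x = {"a": 5, "b": 3, "c": 1, "d": 4}
user_y = {"a": 5, "b": 1, "c": 2, "e": 2}
p_xy = difference_proportions(user_x, user_y)  # one third each for i = 0, 1, 2

# Hypothetical weights in [-1, 1]: small differences count for, large against.
q = [1.0, 0.5, 0.0, -0.5, -1.0]
similarity = weighted_similarity(p_xy, q)  # approximately 0.5
```

Under these assumptions the similarity needs only one counting pass over the co-rated items and a short dot product, which is consistent with the efficiency argument made in the abstract.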
Finally, the new similarity measure is evaluated experimentally on two data sets, Movielens and FilmAffinity. A recommendation model is built from the training set, and each individual q in the population is then applied to the training set to make predictions and recommendations, which yields the system MAE (mean absolute error) for that individual. If this MAE is below a given threshold, the individual is taken as the best individual and is applied to the test set for performance evaluation. Comparison of the performance metrics shows that, relative to traditional methods, the proposed method achieves some improvement in prediction and recommendation quality as well as in recommendation efficiency.
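The search for a weight vector whose MAE falls below a threshold could look roughly like the sketch below. The abstract does not describe the genetic operators, so the selection scheme (keep the better half), one-point crossover, Gaussian mutation, and all parameter defaults here are assumptions, and `evolve_weights` and `mae_of` are hypothetical names. In a real experiment `mae_of` would run the recommender on the training set with the candidate weights; the toy fitness at the end is only a stand-in so the code runs on its own.

```python
import random
from typing import Callable, List, Sequence, Tuple

Weights = List[float]


def mean_absolute_error(predicted: Sequence[float],
                        actual: Sequence[float]) -> float:
    """MAE: the average absolute gap between predictions and true values."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def evolve_weights(mae_of: Callable[[Weights], float],
                   size: int,
                   threshold: float,
                   pop_size: int = 20,
                   max_generations: int = 200,
                   mutation_rate: float = 0.1,
                   seed: int = 0) -> Tuple[Weights, float]:
    """Search for a weight vector q in [-1, 1]^size whose MAE is below
    `threshold`; return the best individual found and its MAE."""
    if size < 2 or pop_size < 4:
        raise ValueError("need size >= 2 and pop_size >= 4")
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(size)]
                  for _ in range(pop_size)]
    best, best_mae = population[0], float("inf")
    for _ in range(max_generations):
        scored = sorted(((mae_of(q), q) for q in population),
                        key=lambda pair: pair[0])
        if scored[0][0] < best_mae:
            best_mae, best = scored[0]
        if best_mae < threshold:
            break  # stopping rule from the abstract: MAE below the threshold
        # Assumed selection: the better half survives unchanged.
        parents = [q for _, q in scored[: pop_size // 2]]
        children: List[Weights] = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, size - 1)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Gaussian mutation, clipped so every gene stays in [-1, 1].
            child = [min(1.0, max(-1.0, g + rng.gauss(0.0, 0.2)))
                     if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return best, best_mae


# Stand-in fitness for demonstration only: distance to a made-up target.
TARGET = [1.0, 0.5, 0.0, -0.5, -1.0]


def toy_mae(q: Weights) -> float:
    return mean_absolute_error(q, TARGET)


best_q, best_error = evolve_weights(toy_mae, size=5, threshold=0.05, seed=1)
```

Because the better half of each generation is carried over unchanged, the best MAE found never gets worse from one generation to the next. Whether the threshold is actually reached depends on the fitness function and the generation budget.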
[Degree-granting institution]: Hunan University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.3
Article ID: 2383789
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2383789.html