Research on a Recommendation Algorithm Based on Collaborative Filtering and Weighted Bipartite Graphs

Published: 2018-01-05 23:00

  Source: Jilin University, 2017 master's thesis. Document type: degree thesis


  Keywords: collaborative filtering, data preprocessing, hierarchical clustering, kmeans++, weighted bipartite graph, predicted rating


[Abstract]: With the rapid growth of Internet data, information overload has become increasingly severe, and recommendation is one of the main ways to address it. Traditional collaborative filtering recommendation algorithms, however, suffer from low efficiency, low accuracy, and memory overflow. In addition, data sparsity worsens as data volume grows, so traditional collaborative filtering results become less and less accurate. To address these problems and improve the efficiency and accuracy of recommendation, this thesis makes improvements in the following three areas.

First, the data preprocessing method is improved. A time decay curve is used to reflect the fact that user interests change over time, which brings the model closer to users' real interests. Gaussian normalization is applied to the raw data to remove the inconsistency in rating standards caused by individual factors, so the preprocessed data is more regular and standardized.

Second, an improved hierarchical clustering algorithm and the Kmeans algorithm are used to partition the large user population into several classes of similar users. The method redefines the clustering distance so that it accounts for the effect of co-rated items on user similarity as well as the effect of public items and of the degree of overlap between users' ratings. This makes the clustering results closer to the real situation and also lays a foundation for discovering user interest communities.

Finally, on top of the clustering, each user subclass and its corresponding item subclass are remodeled, and the bipartite graph approach is used to predict ratings for unrated items. The thesis improves on the bipartite graph algorithm and proposes a weighted bipartite graph algorithm, which markedly reduces the complexity of the matrix operations, improves prediction accuracy, and markedly improves recommendation efficiency, providing a theoretical basis for future real-time recommendation.

With these three improvements, the thesis proposes a collaborative filtering recommendation algorithm that combines a weighted bipartite graph with partition clustering. The algorithm handles the shortcomings of traditional algorithms comparatively well: it alleviates data sparsity, low recommendation accuracy, and memory overflow while improving efficiency and accuracy. Experiments implemented in code on the Movielens dataset show that the optimized algorithm achieves good results on MAE, RMSE, recommendation accuracy, and algorithm efficiency.
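The abstract names the preprocessing steps and the evaluation metrics but does not give their formulas. The sketch below is one plausible reading, not the thesis's implementation: it assumes Gaussian normalization means a per-user z-score, and it assumes an exponential half-life form for the time decay curve. The function names and the `half_life_days` parameter are hypothetical. MAE and RMSE follow their standard definitions.

```python
import math
from statistics import mean, pstdev


def gaussian_normalize(user_ratings):
    """Z-score one user's ratings, (r - mean) / std (assumed form).

    This makes strict and lenient raters comparable on a common scale.
    """
    values = list(user_ratings.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A user who rates everything identically carries no preference signal.
        return {item: 0.0 for item in user_ratings}
    return {item: (r - mu) / sigma for item, r in user_ratings.items()}


def time_decay_weight(age_days, half_life_days=180.0):
    """Assumed exponential decay: a rating's weight halves every half_life_days."""
    return 0.5 ** (age_days / half_life_days)


def mae(predicted, actual):
    """Mean absolute error between predicted and actual ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error between predicted and actual ratings."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))
```

Older ratings can be down-weighted by multiplying each normalized rating by `time_decay_weight(age_days)` before computing user similarity. The half-life value is a tuning choice and is not taken from the thesis.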

[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.3





Article link: https://www.wllwen.com/shoufeilunwen/xixikjs/1385192.html

