
Research on a Collaborative Filtering Recommendation Algorithm Based on User Clustering

Published: 2019-03-07 22:18
【Abstract】: With the spread of the Internet, the sheer volume of irrelevant data makes it very difficult for people to pick out useful information. Personalized recommendation systems emerged to help people filter information quickly and effectively. As the core of a recommendation system, the recommendation algorithm has long been a research focus, and among the many recommendation algorithms collaborative filtering is the most widely applied. Collaborative filtering mines users' historical behavior data to discover their preferences, groups users according to those preferences, and recommends items that suit similar tastes. However, as the numbers of users and items in e-commerce systems keep growing, data sparsity and recommendation efficiency have gradually become bottlenecks for collaborative filtering. To improve its recommendation quality and efficiency, this thesis proposes an improved collaborative filtering recommendation algorithm based on user clustering, and designs and implements a B/S (browser/server) movie recommendation system on top of it.

The thesis introduces the background and architecture of personalized recommendation systems, presents the basic idea of traditional collaborative filtering and the main problems it faces, and then improves the traditional algorithm in two respects: offline user clustering and user similarity calculation. First, a joint user clustering algorithm is proposed that takes into account the influence of both user rating information and item category preference information on user clustering. The algorithm clusters the base users separately on rating information and on item category preference information, producing two sets of cluster centers and two user-cluster membership matrices. It computes the similarity between the target user and the two sets of cluster centers, determines the cluster the target user belongs to in each clustering, and merges and deduplicates the results to obtain the target user's nearest-neighbor search space. Second, to address problems such as the insensitivity of the traditional Pearson correlation coefficient to absolute rating values, a weighted Pearson correlation coefficient based on a difference factor is proposed, in which a rating difference factor serves as the weight that corrects the traditional coefficient.

Using the MovieLens dataset, with MAE, precision, recall, and F1 as metrics, several groups of experiments compare the improved algorithm with the traditional user-based collaborative filtering algorithm (CF) and the traditional user-clustering-based collaborative filtering algorithm (UCCF). The experimental results show that the improved algorithm effectively improves both the efficiency and the accuracy of recommendation. Finally, a movie recommendation system is designed and implemented on the improved algorithm. It uses Douban Top250 movie information as its dataset, is implemented with mixed PHP and Matlab programming, and provides personalized recommendations according to users' preference information.
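The two improvements described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the formula for the rating difference factor, so `difference_factor` below is an assumed form (one minus the mean absolute rating gap, normalized by the rating range), and all function names, the 1 to 5 rating scale, and the choice of computing means over co-rated items only are hypothetical choices made for illustration. The thesis's actual definitions may differ.

```python
from math import sqrt


def pearson(u, v):
    """Plain Pearson correlation over the items both users rated.

    u and v map item id -> rating. Means are taken over co-rated items.
    """
    items = sorted(set(u) & set(v))
    if len(items) < 2:
        return 0.0
    mean_u = sum(u[i] for i in items) / len(items)
    mean_v = sum(v[i] for i in items) / len(items)
    num = sum((u[i] - mean_u) * (v[i] - mean_v) for i in items)
    den = sqrt(sum((u[i] - mean_u) ** 2 for i in items)) * sqrt(
        sum((v[i] - mean_v) ** 2 for i in items)
    )
    return num / den if den else 0.0


def difference_factor(u, v, r_min=1.0, r_max=5.0):
    """ASSUMED form of the difference factor (not taken from the thesis).

    Returns 1 when co-rated items have identical ratings and 0 when every
    co-rated item differs by the full rating range.
    """
    items = set(u) & set(v)
    if not items:
        return 0.0
    mean_gap = sum(abs(u[i] - v[i]) for i in items) / len(items)
    return 1.0 - mean_gap / (r_max - r_min)


def weighted_pearson(u, v, r_min=1.0, r_max=5.0):
    """Pearson correlation scaled by the assumed difference factor."""
    return difference_factor(u, v, r_min, r_max) * pearson(u, v)


def candidate_neighbors(target, *clusterings):
    """Union of the target's cluster mates across several clusterings.

    Each clustering maps user id -> cluster label. Using a set merges and
    deduplicates users that share a cluster with the target in both views.
    """
    candidates = set()
    for labels in clusterings:
        own_cluster = labels[target]
        candidates |= {user for user, c in labels.items() if c == own_cluster}
    candidates.discard(target)
    return candidates


# Two users with the same rating trend but a constant gap of 2 stars.
alice = {"m1": 1, "m2": 2, "m3": 3}
bob = {"m1": 3, "m2": 4, "m3": 5}
print(pearson(alice, bob), weighted_pearson(alice, bob))

# Hypothetical cluster labels from the rating view and the category view.
by_rating = {"a": 0, "b": 0, "c": 1, "d": 1}
by_category = {"a": 0, "b": 1, "c": 0, "d": 1}
print(candidate_neighbors("a", by_rating, by_category))
```

In this sketch the plain coefficient treats the two users as almost perfectly correlated even though one consistently rates two stars lower, which is the insensitivity to absolute values that the abstract mentions. The weight shrinks that similarity in proportion to the rating gap. Restricting the neighbor search to `candidate_neighbors` rather than all users is what reduces the online similarity computation.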
【Degree-granting institution】: Beijing Jiaotong University
【Degree level】: Master's
【Year degree granted】: 2017
【Classification number】: TP391.3



