Research on Clustering-Based Personalized Recommendation Algorithms
Published: 2018-09-12 15:54
[Abstract]: As online resources keep growing, personalized recommendation systems have become an important tool for finding them. They reduce the time users spend searching, and they let users find satisfactory resources with little active involvement. Personalized recommendation is an active research area; researchers in China and abroad have studied it extensively and made considerable progress, yet many problems remain. This thesis addresses two of them, cold start and low accuracy. It analyzes and compares the strengths and weaknesses of commonly used personalized recommendation algorithms, and uses big data to analyze the weights of users' basic feature attributes so that the behavioral preferences of new users can be reasonably predicted. It then designs a user-based MI (Multiple Instance) clustering algorithm and proposes a composite similarity measure: a weighted sum of user feature similarity, item basic feature similarity, and item rating similarity. A method for assigning the weighting factors is designed so as to minimize subjective and objective bias. Experiments verify the effectiveness and superiority of the approach in alleviating the cold-start problem and improving recommendation accuracy.

To address data sparsity, the thesis clusters similar users by their profile features, which provides a valid and trustworthy scope for averaging item ratings. Missing values are then replaced with the mean item rating within the cluster. The experiments show that this method is effective against data sparsity.
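The composite similarity described in the abstract can be illustrated with a minimal sketch. The abstract states only that three similarities are combined by a weighted sum; the use of cosine similarity for the components, the constraint that the weights sum to 1, the example weights, and the function names `cosine` and `composite_similarity` are all assumptions made here for illustration, not the thesis's actual definitions.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity of two equal-length numeric vectors.

    Assumption: cosine is used only as a stand-in component measure.
    Returns 0.0 when either vector is all zeros.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def composite_similarity(sim_user, sim_item, sim_rating, weights):
    """Weighted sum of the three component similarities.

    weights is a (w_user, w_item, w_rating) tuple. Requiring the weights
    to sum to 1 is an assumption of this sketch.
    """
    w_user, w_item, w_rating = weights
    if abs(w_user + w_item + w_rating - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return w_user * sim_user + w_item * sim_item + w_rating * sim_rating


# Hypothetical encoded user profiles (for example, age band, gender, occupation).
profile_a = [1.0, 0.0, 1.0]
profile_b = [1.0, 0.0, 1.0]

# A new user has no ratings yet, so the rating term contributes nothing
# and the profile term carries the comparison (the cold-start case).
score = composite_similarity(
    sim_user=cosine(profile_a, profile_b),
    sim_item=0.5,    # hypothetical value
    sim_rating=0.0,  # no shared ratings
    weights=(0.5, 0.3, 0.2),  # hypothetical weights
)
```

Because the user feature term does not depend on rating history, a measure of this shape can still rank neighbors for a user with no ratings, which is how a profile-weighted similarity can ease cold start.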
The experiments use the MovieLens-ml-100k dataset, which includes training and test sets, among other files. The proposed algorithm is evaluated on this dataset, and the results verify its correctness and superiority.
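The cluster-mean filling step from the abstract can be sketched on a toy rating table on the 1 to 5 scale that MovieLens uses. This is an illustration under stated assumptions: the cluster labels are given by hand rather than produced by the thesis's MI clustering, the function name `fill_missing_with_cluster_mean` is hypothetical, and leaving a value missing when nobody in the cluster rated the item is a choice made here, not one stated in the source.

```python
from collections import defaultdict


def fill_missing_with_cluster_mean(ratings, cluster_of):
    """Replace missing ratings with the item's mean rating inside the user's cluster.

    ratings:    dict mapping user -> list of ratings, None marks a missing value
    cluster_of: dict mapping user -> cluster label (assumed precomputed)
    A value stays None if no user in the same cluster rated that item.
    """
    if not ratings:
        return {}
    n_items = len(next(iter(ratings.values())))

    # Per-cluster, per-item running sums and counts of observed ratings.
    sums = defaultdict(lambda: [0.0] * n_items)
    counts = defaultdict(lambda: [0] * n_items)
    for user, row in ratings.items():
        cluster = cluster_of[user]
        for j, r in enumerate(row):
            if r is not None:
                sums[cluster][j] += r
                counts[cluster][j] += 1

    filled = {}
    for user, row in ratings.items():
        cluster = cluster_of[user]
        new_row = []
        for j, r in enumerate(row):
            if r is not None:
                new_row.append(r)  # keep observed ratings unchanged
            elif counts[cluster][j] > 0:
                new_row.append(sums[cluster][j] / counts[cluster][j])
            else:
                new_row.append(None)  # nothing to average within this cluster
        filled[user] = new_row
    return filled


# Toy data: three users, three items; u1 and u2 share a cluster.
ratings = {
    "u1": [5, None, 3],
    "u2": [4, 2, None],
    "u3": [None, 1, None],
}
cluster_of = {"u1": 0, "u2": 0, "u3": 1}

filled = fill_missing_with_cluster_mean(ratings, cluster_of)
# filled["u1"] == [5, 2.0, 3]
# filled["u2"] == [4, 2, 3.0]
# filled["u3"] == [None, 1, None]
```

Averaging only within a cluster means a missing rating is estimated from users with similar profiles rather than from the whole population, which is the "valid and trustworthy scope" the abstract refers to.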
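[Degree-granting institution]: Kunming University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.3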
Article ID: 2239506
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2239506.html