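Research on a Collaborative Filtering Recommendation Algorithm Based on the Cloud Model and User Clustering
Published: 2018-01-21 15:22
Keywords: collaborative filtering; multidimensional similarity; fuzzy clustering; cloud model. Source: Huazhong University of Science and Technology, 2016 master's thesis. Type: degree thesis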
[Abstract]: With the rapid development of Internet technology, data volumes have grown explosively and information overload has become an increasingly prominent problem. Collaborative filtering has proven effective against information overload, but in practice, as the numbers of users and items grow, data sparsity and scalability still limit the algorithm's performance. These issues have become active research topics in the field. The main goal of this thesis is therefore to alleviate data sparsity in recommender systems based on collaborative filtering and to further improve their prediction accuracy.
Clustering is commonly used in recommender systems to group users, discover similar user groups, and thereby find a reasonable set of similar neighbors, which improves prediction accuracy. The traditional Fuzzy C-Means algorithm, however, is sensitive to its initial points and easily falls into local optima. To address these defects, the thesis proposes an improved fuzzy clustering algorithm (the SoMKfcm algorithm). First, an initial cluster-center selection strategy is proposed that effectively avoids the influence of noisy data points. Second, the objective function combines sample weighting with the distance between samples and cluster centers, so that sample attributes contribute unequally. Finally, the iterative solution process is optimized by incorporating simulated annealing, which adds random jumps to the search and keeps the result from being trapped in a local optimum. Experiments on real data sets on the MATLAB platform show that, compared with the traditional algorithm, SoMKfcm achieves better clustering quality and clustering accuracy and effectively remedies the defects of the traditional algorithm.
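For orientation, the baseline that the thesis sets out to improve can be sketched as follows. This is a minimal NumPy version of the standard Fuzzy C-Means iteration, not the SoMKfcm algorithm: the abstract gives no formulas for the initial-center selection, the sample weighting, or the simulated-annealing step, so none of them is reproduced here. The function name `fuzzy_c_means` and its parameters are illustrative assumptions.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """Standard Fuzzy C-Means (baseline only, NOT the thesis's SoMKfcm).

    X is an (n_samples, n_features) array and m > 1 is the fuzzifier.
    Returns (centers, U), where U[i, j] is the membership of sample i
    in cluster j and every row of U sums to 1.
    """
    rng = np.random.default_rng(seed)
    n_samples = X.shape[0]

    # Random initial memberships. This is the initialisation sensitivity
    # that the abstract criticises in the traditional algorithm.
    U = rng.random((n_samples, n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(max_iter):
        # Centers are means of the samples weighted by membership ** m.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]

        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)  # guard against division by zero

        # Membership update: u_ij is proportional to d_ij ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break

    return centers, U
```

Calling `fuzzy_c_means(X, 2)` on a user-feature matrix `X` returns soft memberships; `U.argmax(axis=1)` gives a hard assignment when one is needed. Different values of `seed` can lead to different local optima, which is the weakness that the simulated-annealing step described above is meant to reduce.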
Building on this work, and using both rating data and users' personal information, the thesis proposes a recommendation algorithm that combines the cloud model with user-feature clustering (the CCCF algorithm). First, users' personal information and the backward cloud algorithm of the cloud model are used to reconstruct the rating data and generate a fused behavior-preference vector for each user. Second, on the basis of the fused behavior-preference matrix, users are fuzzy-clustered with the SoMKfcm method and a strategy for selecting important groups is given; this supports data smoothing and neighbor-set selection in the subsequent steps and leads to a multidimensional similarity measure. Finally, ratings are predicted from these results. To verify the effectiveness of the CCCF algorithm, comparative experiments against several related algorithms are run on the MovieLens 1M and MovieLens 100K data sets. The results show that, under different sparsity levels, CCCF effectively mitigates the impact of data sparsity on the recommendation algorithm and markedly improves prediction accuracy.
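The backward cloud step mentioned above can be illustrated with the standard backward normal cloud generator (the variant without certainty degrees), which summarises a set of ratings by three numbers: expectation Ex, entropy En, and hyper-entropy He. The sketch below is an assumption about the general technique, not the thesis's exact procedure: the abstract does not say how personal information is fused in, and the helper names, the clamping of He at zero, and the cosine comparison of two users are all illustrative choices.

```python
import numpy as np


def backward_cloud(ratings):
    """Standard backward normal cloud generator (no certainty degrees).

    Returns the characteristic vector (Ex, En, He) for a 1-D sequence
    of ratings.
    """
    x = np.asarray(ratings, dtype=float)
    ex = x.mean()                                       # expectation Ex
    en = np.sqrt(np.pi / 2.0) * np.abs(x - ex).mean()   # entropy En
    s2 = x.var(ddof=1) if x.size > 1 else 0.0           # sample variance
    # Assumption: clamp at zero, because s2 - en**2 can be slightly
    # negative for small samples.
    he = np.sqrt(max(s2 - en ** 2, 0.0))                # hyper-entropy He
    return np.array([ex, en, he])


def cloud_cosine_similarity(ratings_a, ratings_b):
    """Cosine of the angle between two users' (Ex, En, He) vectors.

    One simple way to compare two clouds; it is not the thesis's
    multidimensional similarity measure.
    """
    a = backward_cloud(ratings_a)
    b = backward_cloud(ratings_b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

Because the characteristic vector is computed from each user's own ratings, two users can be compared even when they have rated no items in common, which is why cloud-model representations are attractive under sparse data.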
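[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.3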
Article ID: 1451836
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1451836.html