
Research on a Collaborative Topic Regression Recommendation Algorithm Combined with Time Series

Published: 2018-04-15 17:41

  Topic: collaborative filtering + probabilistic matrix factorization model; Source: Inner Mongolia University, 2017 master's thesis


[Abstract]: With the rise of information overload, it is becoming increasingly difficult to find the information we actually need on an ever more open Internet. Personalized recommendation effectively addresses this problem by proactively recommending information and products that users are interested in. Collaborative filtering is the most widely used method in personalized recommendation, but because it usually relies on user-item ratings as its only data source, it suffers from problems such as cold start and data sparsity. This thesis builds on the collaborative topic regression (CTR) model, a hierarchical Bayesian model that combines a topic model (LDA) with a matrix factorization model (PMF). It uses the topic model to process the tag information of items and improves the probabilistic matrix factorization model so that, in addition to user-item ratings, it incorporates other factors that influence recommendation: user trust relationships, time series, and item tag information. Users may choose items of interest based on recommendations from their friends and the users they trust, and the temporal order of user ratings also affects user choices. The influence of time series on user relationships is therefore linearly fused with the degree of trust between friends and added to the PMF model to generate user latent feature vectors. In addition, the tags that users assign to items reflect user preferences to some extent, so the LDA topic model is applied to the tag text of items to obtain item latent feature vectors. Finally, the improved LDA and PMF components are combined within the CTR framework to propose the N-CTR model, which uses gradient descent and the expectation-maximization (EM) algorithm to optimize the user and item latent feature matrices and the topic distribution vectors, and then predicts ratings. Experiments on the Last.fm dataset show that the N-CTR model, which combines user trust relationships, time series, item tag information, and rating data, improves on the PMF model, which uses only user-item rating data, by 7.36% in MAE and 7.94% in RMSE. This indicates that the model alleviates data sparsity in recommendation to some extent and is more accurate than traditional collaborative filtering recommendation algorithms.
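The abstract names three pieces that can be computed directly: a linear fusion of trust and time-based influence between users, a CTR-style rating prediction, and the MAE and RMSE metrics. The Python sketch below illustrates them under stated assumptions and is not the thesis's implementation. The function names, the mixing weight `alpha`, and the convex-combination form of the fusion are hypothetical, because the abstract does not give the formula. The prediction rule (user vector dotted with the item's topic proportions plus an item offset) follows the standard CTR formulation.

```python
import math


def fuse_influence(trust, temporal, alpha=0.5):
    """Blend two user-by-user weight matrices element by element.

    ASSUMPTION: a convex combination with a hypothetical weight `alpha`;
    the abstract only says the two signals are "linearly fused".
    """
    return [
        [alpha * t + (1.0 - alpha) * s for t, s in zip(trust_row, temporal_row)]
        for trust_row, temporal_row in zip(trust, temporal)
    ]


def predict_rating(user_vec, topic_props, item_offset):
    """CTR-style prediction: item vector = topic proportions + offset."""
    return sum(
        u * (theta + eps)
        for u, theta, eps in zip(user_vec, topic_props, item_offset)
    )


def mae(predicted, actual):
    """Mean absolute error over paired predictions and true ratings."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def rmse(predicted, actual):
    """Root mean squared error over paired predictions and true ratings."""
    squared = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared / len(actual))
```

Lower MAE and RMSE mean better predictions, so the reported 7.36% and 7.94% improvements over PMF correspond to reductions in these two error values.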
[Degree-granting institution]: Inner Mongolia University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.3

