Research on SVD-Based Recommender Systems and Their Applications

Published: 2018-05-14 18:48

Topic: recommendation algorithms + SVD++; Source: Taiyuan University of Technology, 2017 master's thesis

[Abstract]: Recommender systems are a product of the Internet's rapid development and play an important role in people's lives, work, and study. They have grown rapidly in e-commerce, film, social networking, and other fields, and applied research on recommender systems has been a hot topic in recent years both in China and abroad. Recommendation algorithms and the large-scale data they rely on are the core of a recommender system. SVD-based recommendation techniques can handle both the two-way user-item rating data and the three-way user-item-tag weight data found in recommender systems, which makes them a key and effective family of algorithms able to process both kinds of data. As the volume of data to be processed keeps growing, however, computational efficiency and recommendation accuracy have become central concerns in recommender-system research. To address the low computational efficiency and unsatisfactory accuracy that arise when SVD techniques are applied to recommender systems, this thesis studies the performance of low-order and high-order SVD recommendation algorithms. The main work is as follows:

1. First, the thesis studies the performance of three recommendation algorithms derived from the basic SVD algorithm: LFM, Bias SVD, and SVD++. LFM factorizes the high-dimensional rating matrix into two low-dimensional feature matrices, one for users and one for items. Bias SVD extends LFM by adding user and item baseline information to the model. SVD++ extends Bias SVD by also taking implicit information into account. The three models are compared both theoretically and experimentally. The results show that SVD++ is the most accurate but the least computationally efficient, while LFM is the most computationally efficient but the least accurate.

2. Second, the thesis examines in depth the low computational efficiency caused by the relatively high computational complexity of SVD++. Analysis of the SVD++ model shows that, when the objective function of the prediction model is trained by gradient descent, the behavior of the learning-rate function directly affects the number of iterations required and the convergence speed. The thesis therefore proposes a new learning-rate function for learning the feature parameters of the SVD++ prediction model. The improved function starts with a large value, falls quickly in the middle phase, and stays small and changes slowly in the late phase. Experiments show that, when SVD++ is trained by gradient descent, this method markedly improves computational efficiency while keeping prediction accuracy unchanged.

3. Finally, the thesis studies the HOSVD recommendation algorithm for three-way user-item-tag data. In recommender systems, user-item-tag data often contain redundant tags; exploiting this property by finding associations between tags helps further improve prediction efficiency. The thesis therefore proposes an HOSVD recommendation algorithm that regroups tags with the Apriori algorithm. The Apriori algorithm first preprocesses the original tag data to find frequent tag itemsets, which are defined as new tags and numbered to form new user-item-tag data; HOSVD is then applied to the new data. Experiments show that the method markedly improves recommendation performance.
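As an illustration of how the three low-order models build on one another, the following is a minimal Python sketch of their prediction rules in the form commonly given in the literature. The abstract does not state the thesis's exact formulation, so the function names, parameter names, and toy values here are assumptions chosen for illustration only.

```python
import math


def predict_lfm(p_u, q_i):
    """LFM: dot product of the user and item latent-factor vectors."""
    return sum(p * q for p, q in zip(p_u, q_i))


def predict_bias_svd(mu, b_u, b_i, p_u, q_i):
    """Bias SVD: LFM plus the global mean and user/item baseline terms."""
    return mu + b_u + b_i + predict_lfm(p_u, q_i)


def predict_svdpp(mu, b_u, b_i, p_u, q_i, y_implicit):
    """SVD++: Bias SVD with the user vector augmented by implicit feedback.

    y_implicit holds one factor vector y_j for each item the user has
    interacted with; their sum is scaled by 1 / sqrt(number of items).
    """
    if y_implicit:
        norm = 1.0 / math.sqrt(len(y_implicit))
        implicit = [norm * sum(column) for column in zip(*y_implicit)]
    else:
        implicit = [0.0] * len(p_u)
    augmented = [p + z for p, z in zip(p_u, implicit)]
    return mu + b_u + b_i + predict_lfm(augmented, q_i)


# Toy values (hypothetical, for illustration only).
mu, b_u, b_i = 3.5, 0.2, -0.1
p_u, q_i = [0.5, 1.0], [1.0, 0.5]
y_implicit = [[0.2, 0.0], [0.0, 0.2], [0.2, 0.2], [0.0, 0.0]]

lfm = predict_lfm(p_u, q_i)
bias = predict_bias_svd(mu, b_u, b_i, p_u, q_i)
svdpp = predict_svdpp(mu, b_u, b_i, p_u, q_i, y_implicit)
```

Each model adds terms to the previous one, which is consistent with the reported trade-off: SVD++ does the most work per prediction, because it sums over every item in the user's implicit-feedback set, while LFM does the least.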
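The abstract describes the shape of the proposed learning-rate function (large at first, falling quickly in the middle, small and slowly changing at the end) but does not give its formula. The sketch below uses a logistic decay as one function with that shape; it is an assumption for illustration and not the thesis's actual function. The name `learning_rate` and all default parameter values are hypothetical.

```python
import math


def learning_rate(epoch, lr_max=0.02, lr_min=0.001, midpoint=20, steepness=0.4):
    """Logistic decay from lr_max toward lr_min (an assumed shape).

    Before `midpoint` the rate stays close to lr_max, around `midpoint`
    it drops quickly, and afterwards it flattens out near lr_min.
    """
    decay = 1.0 / (1.0 + math.exp(steepness * (epoch - midpoint)))
    return lr_min + (lr_max - lr_min) * decay


# Learning rate for each of 60 training epochs.
rates = [learning_rate(t) for t in range(60)]
```

In a gradient-descent loop, `learning_rate(epoch)` would replace a fixed step size when updating the model parameters. Large early steps let the parameters move quickly, and small late steps let them settle.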
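The tag-regrouping step can be sketched as follows. The abstract does not say how frequent itemsets are mapped back onto the data, so this sketch makes its own assumptions: each (user, item) pair's tag set is treated as one transaction, each pair is re-labelled with the maximal frequent itemsets its tags contain, and tags that are not frequent are dropped. The names `apriori` and `regroup_tags` and the toy data are hypothetical. The HOSVD step that would follow is not shown.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(tags): count} for every itemset meeting min_support."""
    def count(candidates):
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        return {c: n for c, n in counts.items() if n >= min_support}

    items = {tag for t in transactions for tag in t}
    level = count({frozenset([tag]) for tag in items})
    frequent = dict(level)
    k = 2
    while level:
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in level for s in combinations(c, k - 1))}
        level = count(candidates)
        frequent.update(level)
        k += 1
    return frequent


def regroup_tags(assignments, min_support):
    """Map {(user, item): tags} to (user, item, new_tag_id) triples.

    Assumption of this sketch: new tags are the frequent itemsets, and each
    (user, item) pair keeps only the maximal itemsets its tags contain.
    """
    transactions = [frozenset(tags) for tags in assignments.values()]
    frequent = apriori(transactions, min_support)
    # Number the new tags deterministically: larger itemsets first.
    ordered = sorted(frequent, key=lambda s: (-len(s), sorted(s)))
    tag_ids = {itemset: idx for idx, itemset in enumerate(ordered)}
    triples = []
    for (user, item), tags in assignments.items():
        tags = frozenset(tags)
        contained = [s for s in ordered if s <= tags]
        maximal = [s for s in contained if not any(s < other for other in contained)]
        triples.extend((user, item, tag_ids[s]) for s in maximal)
    return triples, tag_ids


# Toy data (hypothetical): 8 original user-item-tag assignments.
assignments = {
    ("u1", "i1"): {"sci-fi", "space"},
    ("u2", "i1"): {"sci-fi", "space"},
    ("u3", "i2"): {"sci-fi", "space", "classic"},
    ("u1", "i3"): {"classic"},
}
triples, tag_ids = regroup_tags(assignments, min_support=2)
```

In this toy data the tags "sci-fi" and "space" always occur together, so they collapse into a single new tag, and the 8 original assignments become 5 triples. A reduction of this kind shrinks the tensor that HOSVD then has to process.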
[Degree-granting institution]: Taiyuan University of Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.3

[References]