Research and Implementation of a Hybrid Recommendation System Based on Spark

Published: 2018-06-27 05:07

  Topics: recommendation system + Spark platform; Source: Beijing Jiaotong University, 2017 master's thesis


[Abstract]: In the era of big data, recommendation systems have become an indispensable tool for coping with information overload. On the one hand, users rely on recommendation systems to filter useful information out of massive data and obtain strong decision support. On the other hand, providers of recommendation services, such as e-commerce companies and multimedia service providers, hope to use recommendation systems for targeted, personalized marketing that increases revenue. Recommendation systems have advanced rapidly over the past decade, but they still face many challenges, such as the storage, computation and scalability of massive data, inherent data sparsity, and a lack of timeliness. To address these problems, this thesis studies and implements a hybrid recommendation system for the movie domain on the Spark platform. First, it studies the commonly used matrix factorization methods and proposes a hybrid matrix factorization recommendation algorithm that combines a time factor with neighborhood information. The algorithm takes into account how the interests of a user's group shift over time and minimizes the loss function with momentum gradient descent, which speeds up parameter estimation while improving prediction accuracy. Second, for the similarity computation problem in collaborative filtering, it proposes an improved Pearson-coefficient similarity measure that accounts for item popularity and individual rating bias. Experiments show that this measure effectively reduces the algorithm's root mean square error. Third, to address timeliness, the thesis adopts an incremental ALS matrix factorization algorithm: for newly acquired information, the model is modified locally rather than retrained, which saves substantial computation. Experiments show that incremental ALS offers faster interaction speed and higher accuracy than the currently popular incremental SGD, effectively improving the system's response speed. Finally, the thesis designs and implements a movie recommendation system on the Spark platform, comprising log collection, data processing and a hybrid recommendation engine as its main modules and integrating the optimizations above, which effectively mitigates the main problems faced by current recommendation systems.
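The abstract mentions solving a matrix factorization loss with momentum gradient descent but gives no formulas. As a rough illustration only, the sketch below factorizes a tiny rating list with a per-sample momentum update. It is not the thesis's algorithm: the time factor and neighborhood terms are omitted because the abstract does not specify them, and the function names (`train_mf_momentum`, `predict`, `rmse`), the toy data in `RATINGS`, and all hyperparameters are assumptions made for this example.

```python
import math
import random

# Toy data (hypothetical): users 0-1 like items 0-1, users 2-3 like items 2-3.
# The pairs (0, 1) and (3, 0) are deliberately left unobserved.
RATINGS = [
    (0, 0, 5), (0, 2, 1), (0, 3, 1),
    (1, 0, 5), (1, 1, 5), (1, 2, 1), (1, 3, 1),
    (2, 0, 1), (2, 1, 1), (2, 2, 5), (2, 3, 5),
    (3, 1, 1), (3, 2, 5), (3, 3, 5),
]


def train_mf_momentum(ratings, n_users, n_items, rank=2, lr=0.01,
                      reg=0.05, momentum=0.9, epochs=200, seed=0):
    """Fit r_ui ~ mu + p_u . q_i with momentum gradient descent.

    All hyperparameter defaults are illustrative assumptions.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_items)]
    vP = [[0.0] * rank for _ in range(n_users)]  # velocity for user factors
    vQ = [[0.0] * rank for _ in range(n_items)]  # velocity for item factors
    mu = sum(r for _, _, r in ratings) / len(ratings)  # global mean rating

    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - (mu + sum(p * q for p, q in zip(P[u], Q[i])))
            for k in range(rank):
                # Gradients of 0.5*err^2 + 0.5*reg*(p^2 + q^2), both taken
                # at the current values before either factor is changed.
                grad_p = -err * Q[i][k] + reg * P[u][k]
                grad_q = -err * P[u][k] + reg * Q[i][k]
                # Momentum step: keep part of the previous velocity.
                vP[u][k] = momentum * vP[u][k] - lr * grad_p
                vQ[i][k] = momentum * vQ[i][k] - lr * grad_q
                P[u][k] += vP[u][k]
                Q[i][k] += vQ[i][k]
    return mu, P, Q


def predict(mu, P, Q, u, i):
    """Predicted rating of user u for item i."""
    return mu + sum(p * q for p, q in zip(P[u], Q[i]))


def rmse(mu, P, Q, ratings):
    """Root mean square error over a list of (user, item, rating) triples."""
    se = sum((r - predict(mu, P, Q, u, i)) ** 2 for u, i, r in ratings)
    return math.sqrt(se / len(ratings))


if __name__ == "__main__":
    mu, P, Q = train_mf_momentum(RATINGS, n_users=4, n_items=4)
    print("training RMSE:", rmse(mu, P, Q, RATINGS))
    print("prediction for unobserved pair (0, 1):", predict(mu, P, Q, 0, 1))
```

The velocity terms let each update reuse part of the previous step's direction, which is the usual reason momentum reaches a given error in fewer passes than plain gradient descent. On Spark, the same update would typically be applied to distributed partitions of the rating data rather than to a Python list.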
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.3
