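Design and Implementation of a GPU-Based Collaborative Filtering Recommendation Algorithm
Published: 2018-02-27 14:54
Keywords: recommender system, GPU, CUDA, accuracy. Source: Beijing University of Posts and Telecommunications, master's thesis, 2013. Document type: degree thesis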
[Abstract]: The explosive growth of online information has caused information overload, and search engines are inherently passive, so recommendation engines have attracted growing attention and research. Current research on recommender systems focuses mainly on the cold-start problem, matrix sparsity, and recommendation diversity; overall, it aims to improve the quality of recommendation results. Comparatively little work addresses two other problems: the lag caused by the long time needed to compute predictions, which results from the huge scale of the system combined with matrix sparsity, and the lack of intelligence caused by low recommendation accuracy. The usual commercial solution splits the recommender system into an offline computation module and an online real-time recommendation module. The offline module computes predicted recommendations in advance and stores them in a database, from which they are served in real time when users use the system. This gives users a relatively real-time service, but it does not remove the enormous time cost of the massive computation that the system's scale demands. The results still lag: users receive recommendations that the system computed in the past, which cannot reflect their latest behavior as promptly as possible.
The GPU was originally a multi-core processor for graphics and image processing. It is designed for parallelizable, compute-intensive tasks and offers very high computing power and data throughput; for the same task, a GPU often outperforms a CPU by a wide margin.
The most time-consuming part of a recommender system is the offline computation module, and within it the dominant cost is the similarity computation. Similarity computation can be parallelized, so this thesis designs a parallel version of it and ports it to the GPU. To optimize both time and space, the data are stored in CSR (compressed sparse row) format, and the GPU threads run a row-parallel sparse matrix multiplication algorithm. In addition, to address matrix sparsity, the thesis proposes a user similarity algorithm based on transitive information association: the similarity between two users is their direct similarity plus the similarity transferred through their common friends. Experiments show that the implementation achieves a 10x speedup and that the new algorithm improves accuracy by 20%. The results also show that the larger the data set, the more pronounced the speedup.
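The abstract names three ingredients without giving formulas: CSR storage, a row-parallel similarity computation, and a similarity that adds a term passed through common friends. The following is a minimal CPU sketch of how these could fit together, not the thesis's CUDA implementation. Cosine similarity, the damping factor `alpha`, the averaging of two-hop paths, and all function names are assumptions made for illustration; "common friends" is read here as users with positive direct similarity to both users.

```python
# Sketch only: cosine similarity, `alpha`, and the averaging rule are
# illustrative assumptions, not taken from the thesis.
from math import sqrt


def to_csr(dense):
    """Convert a dense user-item rating matrix (list of lists) to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, r in enumerate(row):
            if r != 0:
                values.append(float(r))
                col_idx.append(j)
        row_ptr.append(len(values))  # end offset of this row's nonzeros
    return values, col_idx, row_ptr


def similarity_row(u, values, col_idx, row_ptr):
    """Cosine similarity of user u against every other user.

    Each call writes only row u of the result, so rows are independent;
    on a GPU one thread (or warp) could own one row.
    """
    n_users = len(row_ptr) - 1
    # Row u as a lookup table: item index -> rating.
    u_items = {col_idx[k]: values[k] for k in range(row_ptr[u], row_ptr[u + 1])}
    u_norm = sqrt(sum(r * r for r in u_items.values()))
    out = [0.0] * n_users
    for v in range(n_users):
        dot, v_sq = 0.0, 0.0
        for k in range(row_ptr[v], row_ptr[v + 1]):
            v_sq += values[k] * values[k]
            dot += values[k] * u_items.get(col_idx[k], 0.0)
        if v != u and u_norm > 0 and v_sq > 0:
            out[v] = dot / (u_norm * sqrt(v_sq))
    return out


def transfer_similarity(direct, alpha=0.5):
    """Direct similarity plus an averaged two-hop term through common friends."""
    n = len(direct)
    result = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u == v:
                continue
            # Users w that are similar to both u and v act as intermediaries.
            paths = [direct[u][w] * direct[w][v]
                     for w in range(n)
                     if w not in (u, v) and direct[u][w] > 0 and direct[w][v] > 0]
            indirect = sum(paths) / len(paths) if paths else 0.0
            result[u][v] = direct[u][v] + alpha * indirect
    return result


ratings = [
    [5, 3, 0, 0],  # user 0
    [0, 4, 4, 0],  # user 1
    [0, 0, 5, 2],  # user 2
]
values, col_idx, row_ptr = to_csr(ratings)
direct = [similarity_row(u, values, col_idx, row_ptr) for u in range(len(ratings))]
combined = transfer_similarity(direct)
```

In this toy matrix, users 0 and 2 have rated no item in common, so their direct similarity is zero. Both overlap with user 1, so the transferred term gives them a positive combined similarity, which is the kind of gap that a sparse rating matrix creates and that the transfer rule is meant to fill.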
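[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3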