Research and Implementation of a Personalized News Recommendation System Based on Collaborative Filtering

Published: 2019-03-09 15:35

[Abstract]: With the rapid development of the Internet, information has grown explosively, and users have moved from an era of information scarcity into an era of information overload, in which the sheer volume of information prevents users from finding what they need. To help Internet users find information quickly, researchers have proposed many approaches: portal sites, which serve as relatively specialized information sources; directories, which sort popular websites into categories; and search engines, which return the required information from a few keywords. User needs go further than this, however: users often have no explicit target in mind. Personalized recommendation has been widely adopted because it filters out large amounts of content a user is not interested in and helps the user discover content they are likely to enjoy. Following its success in e-commerce, personalized recommendation has gradually spread to other domains, such as personalized news recommendation. The Internet's entry into the big data era also gives personalized news reading a good opportunity to develop. Theoretical research on personalized news recommendation systems has made considerable progress, but many problems remain open, including scalability, timeliness, cold start, and data sparsity. An efficient and scalable personalized news recommendation system is therefore the focus of this thesis. The main work of the thesis is as follows:
1. A new similarity calculation method is proposed that combines behavioral similarity with content similarity. It solves the problem that traditional similarity measures are inaccurate or cannot be computed at all, and with it the data sparsity problem of collaborative filtering recommendation.
2. A new scalable clustering method suited to personalized news recommendation is proposed. It changes how cluster centers are selected and how distance is measured, which greatly improves the scalability of the news recommendation system.
3. Time factors are incorporated into both the similarity calculation stage and the final recommendation stage of the system, which ensures the timeliness of the recommended news.
4. The entire collaborative filtering news recommendation system is implemented on the MapReduce model, so that it runs in parallel, scales far better, and meets the personalized recommendation needs of massive numbers of news items and users.
5. Experiments on the clustering method and the personalized news recommendation method are carried out to determine the relevant parameters, and functional tests of the final collaborative-filtering-based personalized news recommendation system verify its functions.
The thesis first analyzes the current state of research on personalized recommendation and the Hadoop cloud computing platform, then describes the proposed clustering method for personalized news recommendation and the personalized recommendation algorithm based on multi-dimensional similarity, and finally presents the news recommendation system implemented on the MapReduce model together with detailed test and evaluation results.
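Contributions 1 and 3 can be illustrated with a minimal sketch. The abstract does not give the thesis's actual formulas, so everything below is an assumption chosen for illustration: Jaccard overlap stands in for behavioral similarity, cosine similarity over term weights stands in for content similarity, the two are blended with a hypothetical weight `alpha`, and timeliness is modeled as exponential decay with a hypothetical `half_life_hours`.

```python
import math


def jaccard(reads_a, reads_b):
    """Behavioral similarity (assumed): overlap of two users' sets of read article IDs."""
    union = reads_a | reads_b
    return len(reads_a & reads_b) / len(union) if union else 0.0


def cosine(profile_a, profile_b):
    """Content similarity (assumed): cosine between sparse term-weight dicts."""
    dot = sum(w * profile_b.get(term, 0.0) for term, w in profile_a.items())
    norm_a = math.sqrt(sum(w * w for w in profile_a.values()))
    norm_b = math.sqrt(sum(w * w for w in profile_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hybrid_similarity(reads_a, reads_b, profile_a, profile_b, alpha=0.5):
    """Linear blend of the two similarities; alpha is a hypothetical tuning weight."""
    return alpha * jaccard(reads_a, reads_b) + (1 - alpha) * cosine(profile_a, profile_b)


def time_decay(age_hours, half_life_hours=24.0):
    """Hypothetical timeliness weight: halves every half_life_hours."""
    return 0.5 ** (age_hours / half_life_hours)


if __name__ == "__main__":
    # Two users with no articles in common but similar interests.
    reads_u, reads_v = {"n1", "n2"}, {"n3", "n4"}
    profile_u = {"football": 1.0, "transfer": 0.5}
    profile_v = {"football": 0.8, "league": 0.4}

    print("behavioral only:", jaccard(reads_u, reads_v))
    print("hybrid:", hybrid_similarity(reads_u, reads_v, profile_u, profile_v))
    print("weight of a 48-hour-old article:", time_decay(48.0))
```

In this sketch the behavioral term is zero when two users share no read articles, which is the sparse-data case where a purely behavior-based measure has nothing to work with; the content term still yields a usable score. A real system would tune `alpha` and the decay rate experimentally, as the thesis reports doing for its own parameters.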
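[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP391.3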
