Research on an Adaptive Information Recommendation Mechanism Based on Social Media

Published: 2018-06-22 06:12

Topic: recommender systems + social media; Source: Southwestern University of Finance and Economics (西南财经大学), master's thesis, 2011

【摘要】：由于互联网的优越特性,在其上发布信息极为便捷,这就使得互联网上的信息数量以近乎爆炸的速度增长。如此多的信息即使浏览一遍都无法做到,用户希望能找到感兴趣的部分更是不可能的。传统的搜索方法只能呈现给所有用户一样的排序结果,无法针对不同用户的兴趣偏好提供相应的服务。信息的爆炸使得信息的利用率反而降低,这种现象被称之为“信息过载”。推荐系统是为解决互联网上的信息过载问题而提出的一种智能代理系统,能从互联网的大量信息、中向用户自动推荐出符合其兴趣偏好或需求的资源。在当前Web 2.0的环境下,社会化媒体的出现使得用户不仅是网络内容的浏览者,也是网络内容的制造者。它的发展进一步加剧了网络时代的信息爆炸。传统的推荐系统通过让用户回答问题或者主动定制的方式来获取用户的兴趣,进而实现推荐。然而,用户的兴趣不是一成不变的,它会随着时间的推移而变化。针对该点,本文提出了一种自适应信息推荐机制,来及时跟踪用户兴趣变化,推荐用户感兴趣的资源。社会化媒体形式多样,如论坛、博客、内容社区、社交网络等。在这些形式下,用户可以发布或者转帖一篇文章,其他用户可以对其阅读或评论,这些评论本身又会被其他用户阅读或评论。从用户评论中,可以观察出用户当前感兴趣的话题。传统的基于内容的推荐方法一般根据原文的内容信息来推荐相关文章。然而,我们知道,随着用户讨论的继续,讨论的主题也会发生变化,即用户兴趣也会发生变化。这时,如果仅仅依据原文本体进行推荐,则返回的文章往往不是用户当前最感兴趣的,从而会降低用户的满意度。因此,本文考虑了结合用户评论和原文本体来构建主题模型,利用该模型来选择相关文章。根据观察发现,每条评论对推荐结果的影响应该是不一样的,如有些评论对原文内容有深刻的见解,而有些评论完全是无意义的讨论。所以,当利用用户评论信息来跟踪主题演变时,区分开每条评论的影响非常重要。这里,我们从用户评论中抽取出评论间语义关系、结构关系以及用户权威来区别每条评论对推荐的影响。分析事件报道在网络上的传播,可以发现其存在如下四个特点：转载重合、报道重合、包含重合和追踪重合。这些特点使得基于内容的推荐系统存在一个严重问题—重复推荐,即推荐文章的内容与原文含有相同的信息,这样会增加用户的阅读负担。于是,本文提出了一种方法来解释推荐文章与原文本体之间的逻辑关系(包括一般化、特殊化和重复),以此降低重复内容的推荐,推荐出符合用户需求的文章。本文第一部分介绍了课题的研究背景、研究目的和意义,对文中涉及到的一些基本概念作了简单介绍。介绍了推荐系统的定义；四种主要方法,即基于内容的推荐、协同过滤推荐、混合型推荐和基于数据挖掘技术的推荐；针对四种方法,分别以一个系统实例解释其工作模式；对推荐系统的评测标准进行了汇总。还介绍了社会化媒体的概念以及与传统媒体相比,其具有的一些特点。最后,总结了本文的主要工作和贡献如下： (1)本研究是在国内外率先结合用户评论来协助信息推荐服务的研究,为基于社会化媒体的信息推荐研究提供一条崭新的研究思路,将信息推荐的研究从Web 1.0的传统静态媒体延伸到了Web 2.0的社会化媒体模式。 (2)为了充分利用社会化媒体的用户交互体验特征,我们独创性地设计了一套基于图论的用户评论信息挖掘机制,可以准确地捕捉用户对事件的关注焦点,并将其与原文本体内容相结合,使得推荐的结果既反映了作者的观点,也反映了读者的观点。 
(3)为了减轻用户的认知负担,我们创新性地提出了一套基于信息熵理论来判断文本逻辑关系的机制。通过该机制,我们可以获得推荐文章与原文章的逻辑关系。此外,该研究成果可以广泛地应用到文本分析的内容逻辑判断中。例如,搜索引擎的结果呈现,基于内容的广告设置等。本文第二部分介绍了该课题的研究基础与背景。首先,针对本文的实验对象,即新闻和博客,对已有的相关研究工作进行了总结。新闻推荐从现有的商业新闻推荐系统和学术研究两个方面进行了介绍。接着,针对文中存在的主题漂移问题,对主题检测与跟踪技术的研究发展进行了汇总。最后,对本文将涉及到的相关理论知识作了简要介绍,如语言模型,PageRank算法、信息熵、T检验等。本文第三部分是核心部分,介绍了自适应信息推荐机制的设计。首先,展示了总体系统框架图,并对其运作流程进行简单介绍。然后,针对框架中的各个模块进行详细阐述。通过用户间关系建模计算用户权威,这里的关系包括了引用关系与回复关系。在整个社区中,根据一个用户对另一个用户的信息进行引用或者回复来构建图模型,然后利用PageRank算法计算每个用户的权威。接着,计算评论权重。这里,我们同样利用了图模型,不同的是,现在的模型是建立在用户评论之间的关系上,这里的关系包括了语义、引用和回复关系。语义关系指的是两条评论之间的内容相似性,引用或回复关系指的是一条评论对另一条评论的信息引用或者回复。模型构建好后,也利用PageRank算法得出评论的权重。一条评论质量的好坏,由其作者的权威和评论本身共同决定,因此,我们将用户权威和评论权重结合起来,计算出每条评论的最终权重。其次,将这些权重信息和原文本体、用户评论一起输入到合成器中,构建主题模型。利用该主题模型从数据库中检索出相关文章。最后,根据信息熵理论来解释相关文章与原文本体之间的逻辑关系,返回符合用户兴趣的文章。本文第四部分是实验设计与分析。介绍了系统开发环境、实验数据的获取以及详细信息。实验数据包括两部分：一个是新闻数据集,一个是博客数据集。由于我们获取的是整个网页数据,所以需要对网页进行解析,抽取出所需部分。还介绍了评测标准的选取,为了评测目的,我们除了选用一些常用的指标,还引入了一个新的评测指标—新颖度,来度量返回文章的主题多样性。接着,设计了一系列实验：1)将本文提出的方法与两种常用方法进行比较,结果表明,在新闻和博客数据集上,我们的方法都明显优于其它两种；2)分析了用户权威和评论对推荐效果的影响,实验结果表明结合用户权威和评论信息有利于提高推荐效果；3)分析了评论间关系对推荐效果的影响,实验结果显示,针对不同的文本形式,有不同的推荐效果。对于新闻数据,结合用户评论间的内容关系会导致推荐效果的降低；然而,对于博客数据,结合用户评论间的内容关系有助于推荐效果的提高；4)对推荐关系解释进行了评估。本文的最后一部分是对本文研究工作的总结和未来研究工作的展望。总结了本文研究的基于社会化媒体的自适应信息推荐系统的整体设计；针对本文的研究工作,指出了其存在的一些不足之处,并给出了以后的发展方向。
[Abstract]: The Internet makes publishing information extremely convenient, so the volume of information online is growing at a nearly explosive rate. Users cannot even browse all of it once, let alone find the parts that interest them. Traditional search methods present the same ranked results to every user and cannot serve users according to their individual interests and preferences. The information explosion therefore lowers the utilization of information, a phenomenon known as "information overload". A recommender system is an intelligent agent system proposed to address information overload on the Internet: from the large volume of information online, it automatically recommends resources that match a user's interests, preferences, or needs.
In the current Web 2.0 environment, the emergence of social media makes users not only consumers of online content but also its producers, and its growth has further intensified the information explosion. Traditional recommender systems obtain user interests by asking users to answer questions or to configure their preferences explicitly, and then recommend accordingly. However, user interests are not fixed; they change over time. To address this, this thesis proposes an adaptive information recommendation mechanism that tracks changes in user interest promptly and recommends resources users care about. Social media takes many forms, such as forums, blogs, content communities, and social networks. In these forms, a user can post or repost an article, other users can read or comment on it, and those comments can in turn be read or commented on by others. The topics a user is currently interested in can be observed from user comments. Traditional content-based recommendation methods generally recommend related articles based on the content of the original article. However, as the discussion continues, its topic shifts, which means user interest shifts too. Recommending solely from the original article then often returns articles that are not what the user currently cares about most, which lowers user satisfaction. This thesis therefore combines user comments with the original article to build a topic model and uses that model to select related articles. Observation suggests that comments should not all influence the recommendation equally: some offer deep insight into the original content, while others are entirely meaningless discussion. So when user comments are used to track topic evolution, distinguishing the influence of each comment is very important. We extract the semantic relationships between comments, their structural relationships, and user authority to differentiate each comment's influence on the recommendation.
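The idea of building a topic model from the original article plus weighted comments can be sketched as a mixture of unigram term distributions. This is only an illustration under stated assumptions: the abstract does not give the model's form, and the names `unigram`, `topic_model`, and `article_share`, the whitespace tokenization, and the toy texts are all hypothetical.

```python
from collections import defaultdict

def unigram(text):
    """Maximum-likelihood term distribution over whitespace tokens."""
    tokens = text.lower().split()
    return {t: tokens.count(t) / len(tokens) for t in set(tokens)}

def topic_model(article, comments, weights, article_share=0.5):
    """Mix the article's distribution with a weighted mix of comment distributions.

    Assumes every comment is non-empty and every weight is positive.
    article_share is a hypothetical knob, not a value from the thesis.
    """
    if not comments:
        article_share = 1.0  # nothing to mix in, use the article alone
    total_weight = sum(weights)
    model = defaultdict(float)
    for term, p in unigram(article).items():
        model[term] += article_share * p
    for text, w in zip(comments, weights):
        for term, p in unigram(text).items():
            model[term] += (1 - article_share) * (w / total_weight) * p
    return dict(model)

# Toy data: the second comment is noise and gets a low weight.
article = "central bank raises interest rates"
comments = ["rates hurt housing market", "first"]
weights = [0.9, 0.1]
model = topic_model(article, comments, weights)
```

In this toy case the term "rates" receives the largest probability because both the article and the highly weighted comment mention it, while the term from the low-weight comment contributes little.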
An analysis of how event reports spread online reveals four characteristics: repost overlap, report overlap, containment overlap, and follow-up overlap. These characteristics cause a serious problem for content-based recommender systems: duplicate recommendation, in which a recommended article carries the same information as the original and so adds to the user's reading burden. This thesis therefore proposes a method for explaining the logical relationship between a recommended article and the original article (generalization, specialization, or duplication), which reduces duplicate recommendations and surfaces articles that meet user needs.
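The abstract does not state how the three relationships are computed, so the following is not the thesis's method. It is a minimal sketch of one information-theoretic way to tell them apart, assuming each term carries self-information `-log2 p(term)` and comparing how much of each text's information the other text covers. The labels' working definitions, the function names, and the `threshold` value are assumptions made for illustration.

```python
import math
from collections import Counter

def tokens(text):
    return text.lower().split()

def coverage(a, b, info):
    """Share of the information in term set a that also appears in term set b."""
    total = sum(info[t] for t in a)
    return sum(info[t] for t in a & b) / total if total else 0.0

def relation(original, candidate, background=(), threshold=0.8):
    """Label the candidate relative to the original (working definitions only)."""
    counts = Counter()
    for doc in (original, candidate, *background):
        counts.update(tokens(doc))
    n = sum(counts.values())
    # Self-information in bits; n + 1 keeps every value strictly positive.
    info = {t: -math.log2(c / (n + 1)) for t, c in counts.items()}
    o, c = set(tokens(original)), set(tokens(candidate))
    original_covered = coverage(o, c, info)   # how much of the original is in the candidate
    candidate_covered = coverage(c, o, info)  # how much of the candidate is in the original
    if original_covered >= threshold and candidate_covered >= threshold:
        return "duplicate"
    if original_covered >= threshold:
        return "generalization"   # candidate says everything in the original and more
    if candidate_covered >= threshold:
        return "specialization"   # candidate focuses on part of the original
    return "other"

original = "quake hits coastal city"
labels = [
    relation(original, "coastal city quake hits"),
    relation(original, "quake hits coastal city rescue teams arrive aid shipped"),
    relation(original, "quake hits"),
    relation(original, "stocks fall"),
]
```

A recommender could then drop candidates labeled "duplicate" before returning results, which matches the stated goal of reducing repeated content.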
The first part of this thesis introduces the research background, purpose, and significance, and briefly explains the basic concepts involved. It covers the definition of a recommender system; the four main approaches, namely content-based recommendation, collaborative filtering, hybrid recommendation, and recommendation based on data mining techniques; an example system illustrating how each of the four approaches works; and a summary of evaluation criteria for recommender systems. It also introduces the concept of social media and its characteristics compared with traditional media. Finally, the main work and contributions of this thesis are summarized as follows:
(1) This study is the first, in China or abroad, to use user comments to assist information recommendation services. It offers a new line of research for information recommendation based on social media, and extends recommendation research from the traditional static media of Web 1.0 to the social media model of Web 2.0.
(2) To make full use of the interactive nature of social media, we designed an original graph-theory-based mechanism for mining user comments. It can accurately capture what users focus on in an event and combine that with the content of the original article, so that the recommendations reflect both the author's viewpoint and the readers' viewpoints.
(3) To reduce users' cognitive burden, we propose a novel mechanism based on information entropy theory for determining the logical relationship between texts. With this mechanism, we can obtain the logical relationship between a recommended article and the original article. The result can also be applied widely to judging content logic in text analysis, for example in presenting search engine results and in placing content-based advertisements.
The second part introduces the research foundation and background. First, it summarizes existing work on the experimental subjects of this thesis, namely news and blogs. News recommendation is covered from two angles: existing commercial news recommender systems and academic research. Next, for the topic drift problem addressed in this thesis, it surveys the development of topic detection and tracking techniques. Finally, it briefly introduces the relevant theory used in the thesis, such as language models, the PageRank algorithm, information entropy, and the t-test.
The third part is the core of the thesis and presents the design of the adaptive information recommendation mechanism. It first shows the overall system framework and briefly describes its workflow, then describes each module in detail. User authority is computed by modeling relationships between users; these include quote relationships and reply relationships. Across the whole community, a graph model is built from one user's quotes of or replies to another user's posts, and the PageRank algorithm is then used to compute each user's authority. Next, comment weights are computed. A graph model is used again, but this time it is built on the relationships between comments, which include semantic, quote, and reply relationships. A semantic relationship is the content similarity between two comments; a quote or reply relationship means that one comment quotes or replies to another. Once the model is built, PageRank is again used to obtain the comment weights. The quality of a comment is determined jointly by its author's authority and by the comment itself, so we combine user authority and comment weight to compute each comment's final weight. These weights, together with the original article and the user comments, are then fed into a synthesizer to build the topic model, which is used to retrieve related articles from the database. Finally, information entropy theory is used to explain the logical relationship between the related articles and the original article, and the articles that match the user's interests are returned.
The fourth part covers experimental design and analysis. It describes the system development environment and how the experimental data were obtained, with details. The data comprise two parts: a news dataset and a blog dataset. Because whole web pages were collected, the pages had to be parsed to extract the needed parts. It also describes the choice of evaluation criteria: in addition to commonly used metrics, we introduce a new metric, novelty, to measure the topical diversity of the returned articles. A series of experiments follows: 1) the proposed method is compared with two commonly used methods, and the results show that it clearly outperforms both on the news and blog datasets; 2) the influence of user authority and comments on recommendation quality is analyzed, and the results show that combining user authority with comment information improves recommendation quality; 3) the influence of relationships between comments on recommendation quality is analyzed, and the results differ by text type: for news data, incorporating the content relationships between user comments lowers recommendation quality, whereas for blog data it improves it; 4) the explanation of recommendation relationships is evaluated.
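The abstract names the novelty metric but does not define it. One plausible reading, offered purely as an assumption, is one minus the average pairwise cosine similarity of the returned articles, so that a list of near-identical articles scores close to 0 and a list of unrelated articles scores close to 1. The function names and the bag-of-words representation are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def novelty(articles):
    """One minus mean pairwise similarity; 0.0 when fewer than two articles."""
    vectors = [Counter(text.lower().split()) for text in articles]
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return 1.0 - sum(cosine(a, b) for a, b in pairs) / len(pairs)

repeated = novelty(["quake hits city", "quake hits city"])
diverse = novelty(["quake hits city", "stocks fall sharply"])
```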
The last part summarizes the research and looks ahead to future work. It summarizes the overall design of the social-media-based adaptive information recommendation system studied in this thesis, points out shortcomings of the work, and gives directions for future development.
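[Degree-granting institution]: Southwestern University of Finance and Economics (西南财经大学)
[Degree level]: Master's
[Year degree granted]: 2011
[Classification number]: TP391.3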
