A Study of the Usefulness of User Reviews in Online Communities

Published: 2018-05-05 02:22

  Topics: Web mining + user reviews; Source: Shandong University, master's thesis, 2014


[Abstract]: In recent years, the rapid development of the Internet and Web 2.0, and especially the rise of social networks and microblogging, has greatly changed how the Web is used. Every user is now both a consumer and a creator of content, and the Internet's enormous user base has led to explosive growth in user-generated content (UGC). Because anyone can create content freely, the traditional Internet model has been reshaped to some degree, but content quality has also become uneven. How to extract valuable information from this content effectively has gradually attracted the attention of many researchers.

Taking book reviews on Douban as its object of study, this thesis examines the usefulness of user reviews from two perspectives: classification and ranking. Compared with user reviews on e-commerce sites such as Amazon, book reviews differ in two clear ways. First, Douban book reviews are generally long, averaging more than a thousand characters, and long reviews imply diverse and complex content. Second, books are experience goods and book reviews are partly works of personal creation, so style matters as much as content. This thesis therefore attempts to extract useful features from review content and writing style, as well as from information about the reviewer.

The thesis makes two main contributions. First, using user votes as the measure of review usefulness, it builds a usable review data set and, on that basis, analyzes the characteristics of user reviews and votes. Second, it extracts content-related and style-related features, together with reviewer features, from the data, and models review usefulness with both a classification method and a ranking method. For lexical features, the thesis proposes a weighting scheme tied to the topic of the review; experiments show that this scheme outperforms plain term frequency and TF-IDF weighting in both the classification and ranking models. The experimental results also indicate that a ranking model is the more reasonable approach to mining review usefulness.
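
For readers unfamiliar with the two baseline weightings the abstract compares against, the following is a minimal sketch of plain term frequency and TF-IDF on a toy corpus. It does not reproduce the thesis's topic-related weighting scheme, which the abstract does not specify; the function names, the toy reviews, and the unsmoothed `log(N / df)` form of IDF are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_weights(tokens):
    """Plain term frequency: count of each term divided by review length."""
    counts = Counter(tokens)
    total = len(tokens)
    return {term: n / total for term, n in counts.items()}


def tfidf_weights(docs):
    """TF-IDF per review, using the unsmoothed idf = log(N / df).

    `docs` is a list of token lists. A term that occurs in every
    review gets weight 0, because log(N / N) = 0.
    """
    n_docs = len(docs)
    # Document frequency: number of reviews containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    weighted = []
    for doc in docs:
        tf = tf_weights(doc)
        weighted.append(
            {term: w * math.log(n_docs / df[term]) for term, w in tf.items()}
        )
    return weighted


# Hypothetical, pre-tokenized toy reviews (not data from the thesis).
reviews = [
    ["plot", "slow", "but", "prose", "beautiful"],
    ["plot", "fast", "prose", "plain"],
    ["plot", "summary", "only"],
]
weights = tfidf_weights(reviews)
```

In this toy corpus "plot" appears in every review, so TF-IDF assigns it zero weight, while rarer terms such as "slow" score higher than the more common "prose". Feature vectors built this way would then be passed to a classifier or a ranking model.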
[Degree-granting institution]: Shandong University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.09

Document ID: 1845758


Link to this article: https://www.wllwen.com/guanlilunwen/ydhl/1845758.html

