Research on an Online Review Quality Evaluation Model Based on Text Analysis
Topic: online reviews  Approach: text analysis  Source: Inner Mongolia University, 2017 master's thesis  Document type: degree thesis
[Abstract]: With the rapid growth of the online shopping market and the variety and convenience of shopping platforms and applications, online shopping has made daily life far more convenient, and more and more people are adopting it. Because online goods are virtual and cannot be touched, however, buyers cannot judge the quality of a product before purchase, so many rely on online reviews to make purchase decisions. This in turn has led some unscrupulous merchants to generate large numbers of product reviews through tactics such as "cash back for positive reviews", which raises the time cost of screening reviews for consumers and may cause unnecessary financial loss. Quickly identifying high-quality online reviews has therefore become a new topic in research on online review content. Starting from the content of online reviews, this study first extracts the feature indicators that affect review quality, then builds an indicator system and a model for evaluating review quality, and finally verifies the model's performance. The work consists of the following five parts:
(1) Validity labeling of review text. By improving the length-based automatic labeling algorithm and the K-means algorithm, the Lk-means algorithm is proposed to label review text for validity, which yields the validity indicator.
(2) Indicator extraction. Online review data are divided into numerical and textual types, and combining the two gives the completeness indicator. Rating data are extracted from the numerical reviews, and four indicators are extracted from the textual reviews: information content, readability, topic relevance, and consistency.
(3) Construction of the review quality indicator system. Based on an improved version of the WRC indicators for information quality evaluation and the 1R3C indicators for data quality evaluation identified during the research, the 1W2R3C indicator system is proposed.
(4) Construction of the review quality evaluation model. The model is first built from the indicators obtained. The review data are then split into a training set and a test set: the training set is used to obtain the weight of each indicator in the model, and the test set is used to verify model performance.
(5) Model performance verification. Performance is verified in two ways. First, models built on the proposed 1W2R3C indicator system and on the WRC and 1R3C indicators are compared. Second, alongside the indicator weights trained by the model in this thesis, indicator weights are also obtained by expert scoring and by a grey relational degree correction method, and the resulting models are compared, thereby verifying that the model performs well.
The results offer new methods and a theoretical basis for further research on online review content and, applied in practice, can support consumers' purchase decisions.
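The grey relational degree mentioned in the model verification step can be illustrated with a minimal Python sketch. It assumes the standard Deng formulation (min-max normalization, distinguishing coefficient 0.5) and turns the resulting degrees into indicator weights by simple normalization. The abstract does not describe the thesis's own correction step, so the function name, indicator names, and sample values here are hypothetical.

```python
def grey_relational_weights(reference, indicators, rho=0.5):
    """Return (degrees, weights) for each indicator against a reference sequence.

    reference  -- list of reference quality scores, one per review (assumed input)
    indicators -- dict mapping indicator name to a list of values, same length
    rho        -- distinguishing coefficient, conventionally 0.5
    """
    def normalize(seq):
        # Min-max scaling to [0, 1]; a constant sequence maps to all zeros.
        lo, hi = min(seq), max(seq)
        if hi == lo:
            return [0.0 for _ in seq]
        return [(x - lo) / (hi - lo) for x in seq]

    ref = normalize(reference)
    # Absolute differences between each normalized indicator and the reference.
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, normalize(values))]
        for name, values in indicators.items()
    }
    all_deltas = [d for ds in deltas.values() for d in ds]
    d_min, d_max = min(all_deltas), max(all_deltas)

    degrees = {}
    for name, ds in deltas.items():
        if d_max == 0:
            # Every indicator coincides with the reference.
            degrees[name] = 1.0
            continue
        # Grey relational coefficient per review, averaged into a degree.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in ds]
        degrees[name] = sum(coeffs) / len(coeffs)

    total = sum(degrees.values())
    weights = {name: g / total for name, g in degrees.items()}
    return degrees, weights


if __name__ == "__main__":
    # Hypothetical data: five reviews, two indicators.
    quality = [1, 2, 3, 4, 5]
    sample = {
        "information": [10, 20, 30, 40, 50],
        "readability": [5, 4, 3, 2, 1],
    }
    degrees, weights = grey_relational_weights(quality, sample)
    print(degrees)
    print(weights)
```

An indicator that tracks the reference sequence closely receives a degree near 1 and therefore a larger weight. Weights derived this way could then be compared with trained or expert-assigned weights, as the abstract describes.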
[Degree-granting institution]: Inner Mongolia University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification numbers]: F724.6;F274;F713.55