Research on Product Evaluation Based on Sentiment Analysis
Published: 2018-11-22 11:51
[Abstract]: In an era of rapid Internet growth, online shopping sites such as JD.com (JingDong), Tmall and Amazon play an increasingly important role in daily life, and online shopping has become a major way to buy. Online shoppers typically learn about a product through three channels: pictures, product parameters and reviews. Sellers retouch their pictures, which obscures the product information the pictures carry, and product parameters can be too specialized for every customer to understand. Reviews, by contrast, are readable and rich in detail, so they often become the yardstick by which a customer decides whether to buy. The volume of reviews is enormous, however. How to organize these reviews effectively and build a product evaluation model that helps customers choose products and helps sellers improve them is the focus of this thesis.

Earlier product evaluation models fall into two main classes. The first is based on product parameters: it assumes that product quality is determined entirely by hardware and ignores the customer's experience, although it has the advantage of saving time and effort. The second is based on questionnaires: it puts the customer's perception first, but designing, distributing, collecting and collating questionnaires is time-consuming and laborious. The review-based product evaluation model built in this thesis saves time and effort and also reflects the user's actual experience.

The main work in building the model is as follows:

1. Data acquisition and cleaning. Review data are crawled from e-commerce sites with Python, using crawler rules customized for each site. Three problems affect the raw data: records fetched more than once, the repetitiveness of fake reviews, and the similarity among meaningless reviews. To reduce their influence on the final evaluation model, the author cleans the review data using text-similarity computation.

2. Extraction of sentiment units. A sentiment-unit extraction model based on dictionary matching transforms the unstructured review data into standardized, questionnaire-style data. To improve the accuracy and completeness of the extraction, the author uses the Apriori model to expand the positive and negative evaluation dictionaries provided by HowNet. The final assessment shows that the model extracts sentiment units from short sentences with an accuracy of 90%.

3. Construction of the product evaluation model. An LDA model is used to analyze the reviews and find their latent topics, from which an indicator system is built. Next, so that high-quality, widely endorsed reviews carry more weight in the final result, a review validity model is established. Finally, a fuzzy evaluation model is chosen to evaluate the products, with the fuzzy matrix constructed from the results of the validity model.

The author builds the review-based evaluation model from the reviews of three Xiaomi phones. The results show that the Xiaomi Max is slightly ahead in battery capacity and screen, which agrees closely with the product parameters. For the camera, the Xiaomi 5s should rank first on parameters alone, yet in the evaluation results it narrowly loses to the Xiaomi 5. Analysis of the reviews shows that the Xiaomi 5s suffers from failure to focus, blurred photos under slight shake, and insufficient resolution. The results indicate that the sentiment-analysis-based product evaluation model, which combines web crawling, sentiment analysis techniques and statistics, saves time and effort and produces results that closely match the customer's experience.
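The last step of the abstract, building the fuzzy matrix from the validity model's results, can be illustrated with a small sketch. This is only one plausible reading, not the thesis's implementation: the grade set, the indicator names, the validity scores, the indicator weights and the choice of the weighted-average operator are all assumptions made for the example, since the abstract does not specify them.

```python
from collections import defaultdict

# Hypothetical grade set; the abstract does not name the grades actually used.
GRADES = ("positive", "neutral", "negative")


def membership_matrix(units, indicators):
    """Build the fuzzy matrix R, one row per indicator and one column per grade.

    Each sentiment unit is a tuple (indicator, grade, validity). A row holds
    the validity-weighted share of each grade, so a review with a higher
    validity score pulls the row further toward its own grade.
    """
    totals = {ind: defaultdict(float) for ind in indicators}
    for indicator, grade, validity in units:
        totals[indicator][grade] += validity
    matrix = []
    for ind in indicators:
        row_sum = sum(totals[ind].values())
        matrix.append(
            [totals[ind][g] / row_sum if row_sum else 0.0 for g in GRADES]
        )
    return matrix


def fuzzy_evaluate(weights, matrix):
    """Combine indicator weights W with R using the weighted-average operator."""
    return [
        sum(w * row[j] for w, row in zip(weights, matrix))
        for j in range(len(GRADES))
    ]


# Toy data, invented for illustration (not taken from the thesis).
indicators = ["battery", "screen", "camera"]
units = [
    ("battery", "positive", 0.9),
    ("battery", "negative", 0.1),
    ("screen", "positive", 0.5),
    ("screen", "neutral", 0.5),
    ("camera", "negative", 0.6),
    ("camera", "positive", 0.2),
    ("camera", "neutral", 0.2),
]
weights = [0.3, 0.3, 0.4]  # hypothetical indicator weights, summing to 1

R = membership_matrix(units, indicators)
B = fuzzy_evaluate(weights, R)
for grade, value in zip(GRADES, B):
    print(f"{grade}: {value:.2f}")
```

Because every row of R sums to 1 and the weights sum to 1, the result vector B also sums to 1 and can be read as the product's overall membership in each grade. Comparing the individual rows of R across products gives the per-indicator comparison (battery, screen, camera) that the abstract describes.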
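[Degree-granting institution]: Anhui University of Finance and Economics
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: F713.36
Article ID: 2349297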
Article link: https://www.wllwen.com/jingjilunwen/guojimaoyilunwen/2349297.html