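A Study of Sentiment Orientation Analysis of Online Textbook Reviews
Published: 2018-03-31 17:15
Topic: online textbook reviews; Focus: fine-grained sentiment analysis; Source: Xinjiang Normal University, master's thesis, 2017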
[Abstract]: With the rapid growth of e-commerce, online bookstores have become an important platform for many merchants to sell books, and online shopping, with its affordable prices and convenience, has gradually become the preferred way to buy books. More and more users, after reading a book, are also keen to share online their genuine opinions of and experiences with the books they bought. The large volume of book reviews appearing on e-commerce sites contains users' evaluations of books: prospective buyers can use them to reduce purchase risk and reach a satisfactory purchase, and merchants and publishers can use them to make sound, effective decisions. Mining online book reviews is therefore of considerable significance and practical value to consumers, merchants, and publishers. This thesis applies fine-grained sentiment analysis to online reviews of textbooks, mining sentiment orientation at the level of individual textbook features to provide useful reference information for consumers and merchants. The thesis first surveys the state of research, in China and abroad, on coarse-grained and fine-grained sentiment orientation analysis of online reviews. It then examines in detail the theory and techniques of fine-grained sentiment analysis, identifying the steps of the analysis and the key techniques in each step. On this basis, online textbook reviews are collected with web crawler software, and the collected data are denoised through deduplication, cleaning, and replacement of pinyin and English, yielding training and test corpora for textbook review analysis. Chinese word segmentation software, together with a custom segmentation dictionary, is used to perform and refine word segmentation and part-of-speech tagging of the review corpus. Then, based on the tagging results and on the observation that product features are usually nouns and noun phrases, the word-formation rules of noun phrases are summarized. These rules are used to extract candidate product features from the training corpus, and the candidates are screened by word-frequency filtering and manual verification to build a lexicon of textbook product features. Next, given the domain characteristics of textbook reviews, a domain sentiment lexicon, a web sentiment lexicon, and a polarity-modifier sentiment lexicon are built from the training corpus on top of a general-purpose sentiment lexicon, forming the sentiment lexicon resources for textbook reviews. Finally, the thesis analyzes the problem that the existing SBV algorithm fails to identify certain feature-opinion pairs when applied to textbook reviews, proposes improvements, and designs a sentiment orientation analysis algorithm for textbook review text using the polarity lexicons and feature lexicon built in this work. Experiments on the test corpus show that, compared with the general-purpose sentiment lexicon and the SBV algorithm, the proposed algorithm and lexicon resources clearly improve the evaluation metrics, which demonstrates the effectiveness of the resources constructed and the algorithm designed in this thesis.
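The pipeline described in the abstract (noun-phrase candidate extraction, word-frequency filtering, and lexicon-based scoring of feature-opinion pairs) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the lexicon entries are toy, hypothetical values, the input is assumed to be already segmented and POS-tagged (with `n` marking nouns), and a simple look-ahead window stands in for the SBV dependency-based pairing, which would require a dependency parser.

```python
from collections import Counter

# Toy, hypothetical lexicons for illustration only; the thesis's actual
# domain, web, and polarity-modifier lexicons are not reproduced here.
SENTIMENT = {"清晰": 1.0, "实用": 1.0, "差": -1.0, "模糊": -1.0}
NEGATORS = {"不", "没有"}
INTENSIFIERS = {"很": 1.5, "非常": 2.0}


def noun_chunks(sent):
    """Yield (phrase, start, end) for each maximal run of noun-tagged tokens."""
    i = 0
    while i < len(sent):
        if sent[i][1] == "n":
            j = i
            while j < len(sent) and sent[j][1] == "n":
                j += 1
            yield "".join(word for word, _ in sent[i:j]), i, j
            i = j
        else:
            i += 1


def candidate_features(tagged_sentences, min_freq=2):
    """Keep noun phrases that occur at least min_freq times in the corpus."""
    counts = Counter(
        phrase for sent in tagged_sentences for phrase, _, _ in noun_chunks(sent)
    )
    return {phrase for phrase, count in counts.items() if count >= min_freq}


def feature_sentiment(sent, features, window=3):
    """Pair each known feature with the first sentiment word that follows it
    within `window` tokens, applying any modifiers found in between."""
    results = {}
    for phrase, _, end in noun_chunks(sent):
        if phrase not in features:
            continue
        for k in range(end, min(end + window, len(sent))):
            word = sent[k][0]
            if word in SENTIMENT:
                score = SENTIMENT[word]
                for modifier, _ in sent[end:k]:
                    if modifier in NEGATORS:
                        score = -score  # negation flips polarity
                    elif modifier in INTENSIFIERS:
                        score *= INTENSIFIERS[modifier]  # degree adverb scales it
                results[phrase] = score
                break
    return results


# Hypothetical pre-tagged reviews: lists of (word, POS tag) pairs.
corpus = [
    [("印刷", "n"), ("很", "d"), ("清晰", "a")],
    [("印刷", "n"), ("不", "d"), ("清晰", "a")],
    [("内容", "n"), ("非常", "d"), ("实用", "a")],
    [("内容", "n"), ("很", "d"), ("差", "a")],
    [("快递", "n"), ("差", "a")],
]
features = candidate_features(corpus, min_freq=2)
scores = [feature_sentiment(sent, features) for sent in corpus]
```

Here `快递` is dropped by the frequency filter because it occurs only once, so the last review yields no feature-level score. In a fuller setup, the manual verification step and the dependency-based pairing mentioned in the abstract would replace the frequency threshold alone and the fixed window.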
[Degree-granting institution]: Xinjiang Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1;G423.3
Article ID: 1691681
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1691681.html