Research on a Vertical Search Engine for Product Reviews Based on Sentiment Classification

Published: 2018-08-05 14:18
[Abstract]: With the steady development of Internet technology, the rise of e-commerce, and the emergence of BBSs, blogs, and microblogs, online interaction between merchants and buyers is becoming increasingly frequent. More and more buyers post reviews online after using a product; the number of reviews grows daily, and the reviews themselves are largely colloquial and unstructured. When merchants assess market supply and demand, or when potential buyers decide on a purchase, manually picking out the information they care about from a massive body of product reviews is time-consuming and laborious, and the result is one-sided and out of date. Search engines therefore play an important role on today's Internet. However, powerful engines such as Baidu and Google are general-purpose search engines that span many fields and categories, and they fall short in the specific domain of product reviews. Against this background, researching and developing a vertical search engine for product reviews with sentiment classification is well justified.

Building on existing research in China and abroad, this thesis further studies four problems in Chinese product review text: identifying evaluation objects, identifying evaluation phrases, identifying the pairing between evaluation objects and evaluation phrases, and analyzing the sentiment orientation of evaluation phrases. The main work is as follows:

(1) For evaluation object identification, part-of-speech sequences are used to obtain a candidate set of evaluation objects, and the concepts of completeness and stability of an evaluation object, together with corresponding algorithms, are proposed to filter noise from the candidates. Candidates are then ranked by confidence using co-occurrence rules between evaluation objects and evaluation phrases and the frequency with which each object appears in the whole review text or the whole corpus, and the evaluation objects are finally extracted.

(2) The conjunction dictionary, sentiment word dictionary, degree word dictionary, and negation word dictionary are improved and used to identify evaluation phrases and analyze their sentiment orientation. A support vector machine with eight features describing the relationship between an evaluation object and an evaluation phrase is used to identify object-phrase pairings, and the sentiment orientation of the whole review text is finally determined.

(3) Based on the sentiment orientation of Chinese product review texts, a vertical search engine is built with the popular SSH framework, a MySQL database, and the open-source package Lucene, so that users can query the information they are interested in conveniently and quickly.

The vertical search engine with sentiment classification built through this research lets merchants and potential customers find useful information quickly and accurately in a vast number of reviews, which gives it some commercial value. The proposed approach to sentiment classification of Chinese text has some academic value.
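As a rough illustration of item (2), the sketch below shows one common way that sentiment, degree, and negation dictionaries can be combined to score an evaluation phrase. It is not the thesis's algorithm: the dictionary entries, the numeric weights, and the function names `phrase_polarity` and `review_polarity` are all hypothetical, English tokens stand in for segmented Chinese words, and the SVM pairing step is left out.

```python
# Tiny illustrative lexicons (assumed values; the thesis's dictionaries are not reproduced here).
SENTIMENT = {"good": 1.0, "clear": 1.0, "bad": -1.0, "slow": -1.0}
DEGREE = {"very": 2.0, "slightly": 0.5}
NEGATION = {"not", "never"}


def phrase_polarity(tokens):
    """Score one evaluation phrase given as a list of already-segmented tokens."""
    score = 0.0
    weight = 1.0  # accumulated degree multiplier for the next sentiment word
    sign = 1.0    # flipped by each negation word seen before the sentiment word
    for tok in tokens:
        if tok in NEGATION:
            sign = -sign
        elif tok in DEGREE:
            weight *= DEGREE[tok]
        elif tok in SENTIMENT:
            score += sign * weight * SENTIMENT[tok]
            # Modifiers apply only to the sentiment word that follows them.
            weight, sign = 1.0, 1.0
    return score


def review_polarity(phrases):
    """Sum phrase scores and map the total to a coarse label."""
    total = sum(phrase_polarity(p) for p in phrases)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    review = [["very", "good"], ["not", "clear"]]
    print(review_polarity(review))
```

Summing phrase scores is the simplest possible aggregation; a system like the one described would presumably score only phrases that have been paired with an evaluation object.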
[Degree-granting institution]: Hunan University of Technology
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP391.3


Article ID: 2166049



Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2166049.html

