Research on Models and Methods for Opinion Target Identification

Published: 2018-08-19 15:48

【Abstract】: With the development of Internet technology, e-commerce has become an increasingly indispensable part of daily life, and the volume of user opinions and reviews has grown rapidly along with it. These reviews contain users' evaluations of the functions, attributes, and items of a given domain. Using this information effectively helps improve product quality and reveal what consumers actually need, which has driven the emergence and development of opinion target identification (also rendered as "evaluation object recognition"). The opinion target in a review is the entity toward which the opinion holder expresses sentiment, and it usually consists of one or more words. Opinion target identification is the task of accurately extracting the true opinion targets from given product reviews. In terms of method, approaches can be divided into supervised, unsupervised, and semi-supervised learning; in terms of application, the task can be divided into single-domain and cross-domain problems. This thesis studies models and methods for single-domain opinion target identification and analyzes the strengths and weaknesses of each by comparing their experimental results. The main contributions can be summarized in three points.

First, an opinion target identification method based on unsupervised learning. The thesis first applies association rule mining, a data mining technique, to extract the most frequently occurring noun phrases in the corpus as candidates, and then filters them further by the semantic relatedness of words to obtain the candidate set of opinion targets in each sentence.
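The frequency-based candidate step described above can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the noun phrases of each sentence have already been extracted by some chunker, it reduces association rule mining to the support-counting step for single phrases, and it omits the semantic-relatedness filter. The function name `frequent_candidates`, the `min_support` threshold, and the sample data are all hypothetical.

```python
from collections import Counter


def frequent_candidates(sentences, min_support=0.5):
    """Return noun phrases whose sentence-level support is >= min_support.

    `sentences` is a list of lists: the noun phrases found in each sentence
    (assumed to be produced by an upstream chunker, not shown here).
    """
    n = len(sentences)
    if n == 0:
        return {}
    counts = Counter()
    for phrases in sentences:
        counts.update(set(phrases))  # count each phrase once per sentence
    return {p: c / n for p, c in counts.items() if c / n >= min_support}


# Hypothetical, hand-made input: noun phrases per review sentence.
reviews = [
    ["battery life", "screen"],
    ["screen", "price"],
    ["battery life", "screen", "camera"],
    ["camera"],
]
candidates = frequent_candidates(reviews, min_support=0.5)
```

With this toy input, "price" falls below the threshold and is discarded, while the remaining phrases would go on to the filtering stage.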
On this basis, the thesis adopts an identification method based on syntactic parse trees and the double propagation algorithm, used respectively to identify opinion targets formed by noun phrases and opinion targets that occur with low frequency.

Second, an opinion target identification method based on sequential models. Because a review is a context-dependent sequence of words, a sequential model can make effective use of context and improve identification accuracy. The thesis extracts word-level features, syntactic features, and features from external corpora as model input, and uses a conditional random field (CRF) to learn the relationships among these features. Experiments show that the choice of feature combination strongly affects the results, and that with suitable features the sequential model achieves excellent results.

Third, opinion target identification based on recurrent neural networks (RNNs). An RNN is an end-to-end model that can dispense with tedious preprocessing and feature extraction. The thesis compares several common RNN models on the task and analyzes their strengths and weaknesses. To address the problem that RNNs cannot effectively capture the dependencies among output labels, the thesis also proposes a new type of RNN, the output-aware recurrent neural network. Experiments show that it not only outperforms the other RNNs but also converges faster.
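Both the CRF and the RNN approaches treat the task as labeling each word in a review. The sketch below shows how per-word labels can be turned back into opinion target spans. It assumes a B/I/O tagging scheme (B = first word of a target, I = continuation, O = outside), which is a common convention for sequence labeling but is not stated in the abstract; the function name `decode_targets` and the example sentence are hypothetical.

```python
def decode_targets(tokens, tags):
    """Collect opinion target spans from a B/I/O-tagged token sequence."""
    targets, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:  # close any span that is still open
                targets.append(" ".join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:  # "O", or a stray "I" with no open span
            if current:
                targets.append(" ".join(current))
            current = []
    if current:  # flush a span that runs to the end of the sentence
        targets.append(" ".join(current))
    return targets


# Hypothetical example sentence with hand-assigned tags.
tokens = ["The", "battery", "life", "is", "great", "but", "the", "screen", "scratches"]
tags = ["O", "B", "I", "O", "O", "O", "O", "B", "O"]
targets = decode_targets(tokens, tags)
```

An "I" tag is only valid after a "B", which is one concrete example of the dependency among output labels that the abstract refers to.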
【Degree-granting institution】: Harbin Institute of Technology
【Degree level】: Master's
【Year degree granted】: 2016
【Classification number】: TP391.1

Article ID: 2192122


Article link: https://www.wllwen.com/jingjilunwen/dianzishangwulunwen/2192122.html
