
Research on Answer Fusion Methods Based on Deep Learning

Published: 2018-09-04 08:20
[Abstract]: Automatic question answering is an important task in natural language processing. A corpus of question-answer pairs is the main source of answers for an automatic question answering system, and these pairs are generally extracted from community Q&A sites such as Baidu Zhidao and Zhihu. However, a question in a Q&A community usually has several answers that respond to it from different angles, whereas the corpus keeps only one of them as the reply to the question, so the answers in the corpus are incomplete. This thesis therefore studies answer fusion, which merges multiple candidate answers in order to address the incompleteness and redundancy of answers in question answering corpora. Deep learning methods and attention mechanisms are used to solve the answer fusion problem. Answer fusion extracts answers from multiple candidate answers, so the accuracy of answer extraction determines the accuracy and comprehensiveness of the fused result. At the same time, because the fused answer is assembled from sentences drawn from several candidate answers, it can be semantically incoherent and hard to read. The thesis therefore improves answer fusion from two directions: automatic answer extraction and semantic coherence. Automatic answer extraction selects, from multiple candidate answers, the answer sentences that can answer the question, which makes the answer more concise and more comprehensive. Semantic coherence usually shows up as the order of sentences within a paragraph, so sentence ordering is used to strengthen the coherence among the extracted sentences and make the fused answer more readable.
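To make the sentence ordering task concrete, the following is a minimal sketch that assumes some pairwise scorer already exists. It is not the thesis's model: the function name `order_sentences`, the toy score table, and the exhaustive search are all illustrative assumptions, and in the thesis the pairwise judgment would come from a trained neural network rather than a lookup table.

```python
from itertools import permutations


def order_sentences(sentences, pair_score):
    """Return the ordering that maximizes the summed score of adjacent pairs.

    pair_score(a, b) should be higher when sentence b reads well directly
    after sentence a. The search is exhaustive, so it is only practical for
    a handful of sentences.
    """
    best_order, best_total = list(sentences), float("-inf")
    for perm in permutations(sentences):
        total = sum(pair_score(a, b) for a, b in zip(perm, perm[1:]))
        if total > best_total:
            best_order, best_total = list(perm), total
    return best_order


# Hypothetical sentences and scores, standing in for a learned model's output.
S1 = "First, install the tool."
S2 = "Then, open the settings page."
S3 = "Finally, restart the program."
TOY_SCORES = {(S1, S2): 0.9, (S2, S3): 0.8}


def toy_pair_score(a, b):
    # Pairs that are not in the table are treated as incoherent (score 0).
    return TOY_SCORES.get((a, b), 0.0)


extracted = [S3, S1, S2]  # sentences in the arbitrary order they were extracted
ordered = order_sentences(extracted, toy_pair_score)
```

Separating the scorer from the search means any pairwise model can be plugged in. For longer answers the exhaustive search would need to be replaced by a greedy or beam search.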
The work centers on automatic answer extraction and sentence ordering and consists of four parts:
1. An automatic answer extraction model based on word co-occurrence. An intra-sentence attention mechanism is used to extract features from the question and the answer sentences. For the corpus at hand, word co-occurrence, document reciprocal, and word similarity features are added, and random sampling is used to handle the data imbalance in the corpus. Compared with the baseline method, the model improves the accuracy of answer extraction.
2. A sentence ordering method based on sentence matching. Deep learning is introduced into sentence ordering, together with sentence matching. Compared with the baseline method, the model improves sentence ordering performance.
3. A sentence ordering method based on attention mechanisms. To strengthen the model's ability to capture semantic and logical relations, attention is introduced into the sentence ordering task, and three models are implemented: one based on static attention, one based on word-alignment attention, and one based on intra-sentence attention. These methods capture the semantic and logical relations between sentences effectively and improve sentence ordering performance.
4. Design and implementation of an answer fusion system. The automatic answer extraction module and the sentence ordering module are integrated into an answer fusion system that addresses the incomplete and verbose answers encountered when building a corpus.
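Two of the ingredients named in the first part can be sketched in a few lines. The abstract does not define the word co-occurrence feature or say which kind of random sampling is used, so the definitions below (question-token overlap ratio, and random undersampling of the majority class) are assumptions chosen for illustration, and the function names are hypothetical.

```python
import random


def cooccurrence_feature(question_tokens, answer_tokens):
    """Fraction of distinct question tokens that also occur in the answer sentence."""
    q = set(question_tokens)
    if not q:
        return 0.0
    return len(q & set(answer_tokens)) / len(q)


def undersample(examples, seed=0):
    """Randomly drop majority-class examples until both classes have equal size.

    examples: list of (features, label) pairs with label 0 or 1.
    """
    rng = random.Random(seed)
    pos = [e for e in examples if e[1] == 1]
    neg = [e for e in examples if e[1] == 0]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Toy data: tokenized question and candidate answer sentence.
question = ["how", "reset", "password"]
answer = ["click", "the", "reset", "password", "link"]
feature = cooccurrence_feature(question, answer)

# Two positive and six negative examples, mimicking an imbalanced corpus.
data = [((i,), 1) for i in range(2)] + [((i,), 0) for i in range(6)]
balanced = undersample(data)
```

In a full model, a scalar feature like this would typically be concatenated with the learned sentence representations before classification. Oversampling the minority class would be an equally plausible reading of "random sampling".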
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.1;TP181



