Sentence Fusion for Complex Questions in Reading Comprehension
Published: 2019-07-22 10:33
[Abstract]: Reading comprehension is currently an active research topic in NLP. A good strategy for answering complex reading comprehension questions must not only extract the answer sentences but also fuse them to generate the final answer, yet most existing work concentrates on extraction. This paper studies sentence fusion for complex question answering and proposes a fusion method that balances the important information in the sentences, their relevance to the question, and the fluency of the output. The method has three steps: first, the parts to be fused are selected based on sentence splitting and word importance; then, information shared between sentences is merged based on word alignment; finally, the output sentence is generated by integer linear programming that optimizes over dependency relations, a bigram language model, and word importance. On a dataset of reading comprehension questions from past college entrance examinations (gaokao), the method achieves an F-score of 82.62% while better preserving the readability and informativeness of the results.
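The second step, merging information shared between sentences through word alignment, can be illustrated with a toy sketch. The abstract does not say how the alignment is computed, so this example swaps in Python's standard `difflib.SequenceMatcher` as a stand-in aligner; the function name `merge_by_alignment` and the English example sentences are hypothetical and are not taken from the paper.

```python
from difflib import SequenceMatcher


def merge_by_alignment(a, b):
    """Merge two token lists, keeping each aligned (shared) span only once.

    Assumption: difflib's longest-matching-block alignment stands in for
    the word alignment used in the paper, which the abstract does not detail.
    """
    merged = []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Span present in both sentences: keep a single copy.
            merged.extend(a[i1:i2])
        else:
            # Unaligned material: keep tokens from a, then tokens from b.
            merged.extend(a[i1:i2])
            merged.extend(b[j1:j2])
    return merged


if __name__ == "__main__":
    s1 = "the cat sat on the mat".split()
    s2 = "the cat slept on the mat".split()
    print(" ".join(merge_by_alignment(s1, s2)))
    # prints: the cat sat slept on the mat
```

The merged token sequence is not yet a fluent sentence ("sat slept"), which is the kind of gap the third step is meant to close by choosing which words to keep under dependency, bigram language model, and word importance scores.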
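[Author affiliations]: School of Computer and Information Technology, Shanxi University; Key Laboratory of Computational Intelligence and Chinese Information Processing of the Ministry of Education, Shanxi University
[Funding]: National High Technology Research and Development Program of China (863 Program) (2015AA015407); National Natural Science Foundation of China, Young Scientists Fund (61100138, 61403238); Natural Science Foundation of Shanxi Province (2011011016-2, 2012021012-1); Shanxi Province Research Project for Returned Overseas Scholars (2013-022); Shanxi Province University Science and Technology Development Project (20121117); Shanxi Province 2012 Merit-Based Science and Technology Activity Project for Returned Overseas Scholars
[Classification number]: TP391.1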
Article ID: 2517566
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2517566.html