An Automatic Chinese Sentence-Group Segmentation Method Based on Coreference Resolution
Published: 2019-04-28 19:57
[Abstract]: Automatic Chinese sentence-group segmentation divides a text into fragments that each cover a different topic. It has important applications in information extraction, summary generation, discourse understanding, and many other fields. Coreference resolution is the process of identifying the antecedents and anaphors in a text and linking them together; resolving different expressions of the same entity is one of the foundations of natural language understanding. Current work on sentence-group segmentation focuses on locating the boundaries between topics and makes little use of the text's own referential relations for language understanding, or produces wrong segmentations because of ambiguous reference. To address this problem, an automatic sentence-group segmentation method based on coreference resolution is proposed. The method starts by resolving reference in the text: a multi-layer filtering coreference resolution method suited to Chinese is used to obtain coreference-chain information, which removes the problems of different nouns denoting the same entity and of pronouns with unclear referents. Combining the coreference-chain information with discourse connectives, validation experiments were designed and carried out in which segmentation is evaluated with a set of evaluation functions J based on multiple discriminant analysis (MDA). The experimental results show that the proposed method segments sentence groups effectively, and that the average Pμ, the statistic for correct segmentation, improves by about 7%.
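The idea that coreference chains remove "different nouns, same entity" mismatches before segmentation can be illustrated with a small sketch. This is not the paper's multi-layer filtering resolver and not its MDA-based evaluation function J; it assumes the chains are already available as a hypothetical `mention_to_entity` mapping, uses plain cosine similarity between adjacent sentences as a stand-in cohesion score, and uses an arbitrary threshold. The toy sentences and all function names are invented for illustration.

```python
from collections import Counter
from math import sqrt


def normalize(sentence, mention_to_entity):
    """Replace each mention found in a coreference chain with its entity ID."""
    return [mention_to_entity.get(tok, tok) for tok in sentence]


def cosine(a, b):
    """Cosine similarity between two token lists treated as bags of words."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def adjacent_cohesion(sentences, mention_to_entity):
    """Cohesion score for every pair of neighbouring sentences."""
    norm = [normalize(s, mention_to_entity) for s in sentences]
    return [cosine(norm[i], norm[i + 1]) for i in range(len(norm) - 1)]


def propose_boundaries(sentences, mention_to_entity, threshold=0.1):
    """Return indices i such that a boundary is proposed before sentence i."""
    scores = adjacent_cohesion(sentences, mention_to_entity)
    return [i + 1 for i, s in enumerate(scores) if s < threshold]


# Toy, pre-tokenized input (hypothetical data, not from the paper).
sentences = [
    ["小明", "买", "电脑"],        # Xiao Ming buys a computer
    ["他", "喜欢", "这台机器"],    # He likes this machine
    ["今天", "天气", "晴朗"],      # The weather is clear today
]
# Assumed output of a coreference resolver: two chains, E1 and E2.
chains = {"小明": "E1", "他": "E1", "电脑": "E2", "这台机器": "E2"}

with_chains = propose_boundaries(sentences, chains)
without_chains = propose_boundaries(sentences, {})
```

In this toy case the first two sentences share no surface tokens, so without chains a boundary is proposed between them; once both mentions are mapped to shared entity IDs, only the boundary before the third sentence remains.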
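[Author affiliations]: School of Computer Science, Hangzhou Dianzi University; School of Software, Zhejiang University
[Funding]: National Natural Science Foundation of China (61202281, 61103101); Ministry of Education Humanities and Social Sciences Research Project, Youth Fund (10YJCZH052, 12YJCZH201)
[Classification number]: TP391.1
Article ID: 2467920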
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2467920.html