Research on Commonsense Acquisition Methods Based on Chinese Serial Verb Sentences
Topics: serial verb sentences + event semantic classes; Source: Jiangsu University of Science and Technology, 2017 master's thesis
[Abstract]: Commonsense knowledge acquisition is an important research topic in artificial intelligence and a long-standing challenge. Its goal is to build large-scale, application-oriented commonsense knowledge bases so that truly intelligent systems become possible. The precondition commonsense and consequence commonsense of events are two important kinds of commonsense knowledge, with great practical value in question answering, natural language understanding, information retrieval, and related fields. However, because commonsense knowledge is implicit, ubiquitous, and foundational, machines cannot automatically acquire large amounts of it. The serial verb sentence is a common sentence pattern in modern Chinese. Each serial verb sentence contains two or more predicate verbs that depend on one another and stand in semantic relations such as purpose, cause and effect, and manner. Each predicate verb corresponds to one event, so a serial verb sentence is a special sentence pattern that describes several events. Because these events are linked by a variety of semantic relations, serial verb sentences carry rich event commonsense. Such sentences occur in large numbers in human language, are structurally simple, and follow recognizable patterns. They can therefore serve as a large, easily accessible knowledge source that opens the way to acquiring commonsense knowledge at scale. To address the problems above, this thesis systematically studies the theory and methods for acquiring precondition commonsense and consequence commonsense from Chinese serial verb sentences. The work covers three aspects. First, it studies the recognition of serial verb sentences and presents a recognition method that combines rules and statistics. To recognize serial verb sentences automatically, the method builds a base rule library from two perspectives, the formal features of serial verb sentences and their semantic roles, and uses statistical methods to compute the probability that the characteristic part of speech of the word between the two predicate verbs is a passive noun. Experiments show that the combined rule-and-statistics method reaches an accuracy of 75.48%, which is 14.46% higher than recognition based on rules alone. Second, it studies the construction of serial verb grammars. Taking the event semantic class grammar as a basis and drawing on the semantic features and syntactic structure of serial verb sentences, it constructs automatically generated serial verb grammar rules, which provide the theoretical foundation for commonsense acquisition from serial verb sentences. Third, it studies commonsense acquisition based on serial verb sentences and presents four acquisition methods: through the semantics of serial verb pairs, through the event argument roles of the serial verb grammar, from the perspective of commonsense knowledge, and through the type of the serial verb sentence. Based on these four methods, seven question templates and accompanying interaction scripts are designed; they pose questions interactively and guide knowledge engineers in acquiring commonsense knowledge. To show that the interaction process is sound, the thesis gives a quantitative evaluation model based on binomial-distribution hypothesis testing to verify the acceptability and effectiveness of the interaction process. Experiments show that the commonsense knowledge acquired with this method reaches an accuracy of 92.5%.
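The abstract names an evaluation model based on binomial-distribution hypothesis testing but does not give its details. As a minimal sketch of what such a test can look like, the Python below runs an exact one-sided binomial test on whether the proportion of correct acquired items exceeds a threshold. Everything specific here is an assumption for illustration and is not taken from the thesis: the function names, the threshold `p0 = 0.8`, the significance level, and the counts (40 items, 37 judged correct, chosen only because 37/40 equals 92.5%).

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p0: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))


def is_acceptable(k: int, n: int, p0: float, alpha: float = 0.05) -> bool:
    """One-sided exact test of H0: p <= p0 against H1: p > p0.

    Returns True when H0 is rejected at level alpha, i.e. when the
    observed number of correct items is unlikely under accuracy p0.
    """
    return binomial_upper_tail(k, n, p0) < alpha


if __name__ == "__main__":
    # Hypothetical counts, NOT reported in the abstract.
    n_items = 40      # commonsense items reviewed
    n_correct = 37    # items judged correct (37/40 = 92.5%)
    p0 = 0.8          # assumed minimum acceptable accuracy

    p_value = binomial_upper_tail(n_correct, n_items, p0)
    print(f"p-value = {p_value:.4f}")
    print("acceptable" if is_acceptable(n_correct, n_items, p0) else "not shown acceptable")
```

An exact tail sum is used rather than a normal approximation because it remains valid for small samples; with larger samples a library routine such as `scipy.stats.binomtest` would do the same job.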
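[Degree-granting institution]: Jiangsu University of Science and Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP18; TP391.1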