Higher-Order Syntactic Model
Published: 2018-08-26 14:23
[Abstract]: Syntactic analysis is an important area of computational linguistics and a necessary bridge between grammar and semantics. Because Chinese is more ideographic than the Indo-European languages, its sentence constituents are loosely connected and its syntactic rules are flexible, which makes syntactic analysis of Chinese especially complex. In natural language processing, the first prerequisite for syntactic analysis is a complete system of syntactic models that is tailored to the language at hand while remaining compatible with, and able to generalize over, other languages. Such a model must be able to describe and abstract the grammatical structure of a sentence, the relations among its constituents, and even its shallow semantic logic. At the same time, its annotation scheme, that is, its means of expression, must be comprehensive, flexible, and easy for machines to process.
Traditional syntactic models include phrase structure grammar, dependency grammar, case grammar, and FrameNet. Each addresses syntactic analysis from the area where it is strongest, but each also has shortcomings in its own framework. Many problems surface in practical use, and none of these models expresses and analyzes the syntax of natural language satisfactorily.
Building on prior work in linguistics, computational linguistics, and related fields, this thesis substantially revises existing syntactic models, attempts to express and analyze syntax from a more fundamental perspective, and on that basis establishes the higher-order syntactic model. The model consists of seven parts: a part-of-speech tagging system, a punctuation tagging system, transformational-generative rules, a structure tagging system, a tag transformation set, a prepositional phrase set, and a relation tagging system. The part-of-speech tagging system covers 67 tags at 3 levels; the punctuation tagging system contains 18 tags and is closely linked to the structure and relation systems; the structure tagging system summarizes 11 common grammatical structures; and the relation tagging system defines 33 semantic-logical relations at 3 levels. Each of the seven parts has its own emphasis. Progressing from shallow to deep and approaching the problem from different perspectives, together they form a whole that spans several levels of syntactic analysis, from part-of-speech tagging to the understanding of semantic logic. The model resolves many problems encountered in the practice of syntactic analysis and offers an account of the foundational problem of syntax from a deeper perspective.
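The seven-part composition described in the abstract can be written down as a small data structure. The following is only an illustrative sketch: the class name `Component`, the field names, and the helper `summarize` are hypothetical and do not come from the thesis; only the component names and the stated counts are taken from the abstract. Figures the abstract does not give are left as `None` rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Component:
    """One part of the model as listed in the abstract (hypothetical type)."""
    name: str
    levels: Optional[int] = None  # number of levels, if the abstract states it
    size: Optional[int] = None    # number of tags/structures/relations, if stated


# The seven parts named in the abstract, in the order they are listed there.
HIGHER_ORDER_SYNTACTIC_MODEL = (
    Component("part-of-speech tagging system", levels=3, size=67),
    Component("punctuation tagging system", size=18),
    Component("transformational-generative rules"),
    Component("structure tagging system", size=11),
    Component("tag transformation set"),
    Component("prepositional phrase set"),
    Component("relation tagging system", levels=3, size=33),
)


def summarize(components):
    """Return one human-readable line per component."""
    lines = []
    for c in components:
        detail = []
        if c.levels is not None:
            detail.append(f"{c.levels} levels")
        if c.size is not None:
            detail.append(f"{c.size} items")
        suffix = f" ({', '.join(detail)})" if detail else ""
        lines.append(f"{c.name}{suffix}")
    return lines


if __name__ == "__main__":
    for line in summarize(HIGHER_ORDER_SYNTACTIC_MODEL):
        print(line)
```

Keeping the unstated sizes as `None` makes it explicit which numbers are reported in the abstract and which would have to be looked up in the full thesis.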
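[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: H087; H146.3
Article ID: 2205141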
Article link: https://www.wllwen.com/wenyilunwen/yuyanxuelw/2205141.html