Research on Natural-Language-Based Knowledge Query Algorithms

Published: 2018-07-07 10:12

Topic: knowledge query + knowledge engineering; Source: Hubei University, master's thesis, 2013

[Abstract]: As society develops, people demand ever more convenient and efficient information acquisition and knowledge query. This demand has driven researchers to pursue both theoretical research and technical development in question answering systems, natural language query, and search engines. Within this work, natural-language-based knowledge query in knowledge base systems is a new area of considerable research value. The core of a knowledge base system is the knowledge base, whose central research problems are knowledge representation and knowledge acquisition; in natural language processing, knowledge query algorithms are likewise a core topic. Within a knowledge query algorithm, the most critical components are the word segmentation algorithm and the matching algorithm. Against this background, the thesis studies natural-language-based knowledge query algorithms for knowledge base systems. Its theoretical foundations are mainly knowledge engineering, natural language processing, the relational model, and parallel computing.

The main innovations of the thesis are as follows:
(1) For the knowledge base part of the knowledge query algorithm, the thesis analyzes the strengths and weaknesses of semantic network knowledge representation and of the relational model, and proposes a knowledge representation method that combines the two. The method has two logical representations: a nested relational model and a chained relational model.
(2) For the intelligent word segmentation part, the thesis optimizes the lexicon structure by proposing a new lexicon index structure, and presents an improved forward maximum matching word segmentation algorithm.
(3) For the sentence pattern template matching part, the thesis presents a storage structure for sentence pattern templates based on a sentence pattern parse tree, building on the tree structures of data structure theory.
(4) On top of the parse tree storage structure, the thesis proposes a coarse matching algorithm for sentence pattern templates, consisting of a filtering algorithm and a tree matching algorithm, and proposes optimizations for the problems found in the query algorithm.

Based on these four points, the thesis is a purely theoretical study of knowledge query over natural language (Chinese character text). The study also has limitations. First, it covers natural language queries in pure Chinese character text only, whereas real-world demand often involves mixed queries containing Chinese characters, digits, Western-language text, and other characters, so the scope of the research is too narrow. Second, the work is purely theoretical: the proposed algorithms are given only as pseudocode (or natural language) and have not been implemented as programs, nor has their performance been verified and tested experimentally. Follow-up work should therefore strengthen the verification of the algorithms and propose better optimization schemes.
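To make the baseline technique named in innovation (2) concrete, here is a minimal Python sketch of plain forward maximum matching over a lexicon indexed by first character. The index layout, the function names `build_index` and `fmm_segment`, and the sample lexicon are all assumptions made for illustration. The abstract does not describe the thesis's actual index structure or its improvements, so this should not be read as the author's algorithm.

```python
from collections import defaultdict


def build_index(words):
    """Group lexicon words by first character and record the longest word per group.

    Hypothetical index layout, chosen only for this illustration.
    """
    index = defaultdict(lambda: {"words": set(), "max_len": 1})
    for word in words:
        if not word:
            continue
        entry = index[word[0]]
        entry["words"].add(word)
        entry["max_len"] = max(entry["max_len"], len(word))
    return dict(index)


def fmm_segment(text, index):
    """Segment text left to right, always taking the longest lexicon word that fits."""
    tokens = []
    i = 0
    while i < len(text):
        entry = index.get(text[i])
        match = text[i]  # fall back to a single character when nothing longer matches
        if entry is not None:
            # Try the longest possible candidate first, then shrink by one character.
            longest = min(entry["max_len"], len(text) - i)
            for length in range(longest, 1, -1):
                candidate = text[i:i + length]
                if candidate in entry["words"]:
                    match = candidate
                    break
        tokens.append(match)
        i += len(match)
    return tokens


# Assumed sample lexicon; a real system would load a full dictionary.
lexicon = ["知识", "知识库", "查询", "算法", "自然", "自然语言", "语言"]
index = build_index(lexicon)
print(fmm_segment("基于自然语言的知识库查询算法", index))
```

Indexing by first character bounds the candidate length at each position by the longest lexicon word that starts with that character, rather than by a global maximum word length. This is one common way to cut down on failed lookups and may or may not resemble the index proposed in the thesis.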
[Degree-granting institution]: Hubei University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3
