Research on Ontology-Based Knowledge Representation and Application in Tea Science

Published: 2018-08-24 18:12
[Abstract]: In the Internet era, rapid advances in information technology have made knowledge massive, multi-source, and heterogeneous. How to organize and manage knowledge so that it can be retrieved effectively is a research hotspot in information retrieval. Ontology, a relatively new knowledge organization tool, represents semantic relations well and supports logical reasoning, and has therefore been widely applied. Tea is one of the world's three major non-alcoholic beverages and is grown around the globe. China, as the birthplace of tea, has a long history of tea research, and tea science knowledge spans cultivation, biochemistry, pests and diseases, inspection science, machinery, culture and customs, industrial economics, and many other fields. Against this technical and domain background, this thesis takes tea science knowledge as its research object, uses ontology technology to organize that knowledge, and applies the resulting ontology in a retrieval system. The thesis has three main parts.

The first part studies the definition, classification, and applications of ontology, reviews the development of knowledge organization tools in the knowledge economy, and compares the strengths and weaknesses of each tool, noting that ontology has drawn particular attention in information organization. Because tea science is a branch of agronomy, the state of agricultural ontology research is also surveyed. The theoretical foundations of ontology construction (construction methods, editing tools, and development tools) are reviewed in preparation for building the tea science ontology.

The second part, having examined the drawbacks of manual ontology construction (it is time-consuming, labor-intensive, and heavily dependent on experts), adopts an ontology learning approach to build the tea science ontology semi-automatically. After an in-depth analysis of ontology learning methods, the tea science ontology is built following the "seven-step method" and the "skeleton method". First, the ICTCLAS word segmentation system is used to segment the collected corpus and tag parts of speech, and a program removes words with specified parts of speech as well as stop words. Second, TF-IDF is used to select feature words by weight and extract tea science concepts, yielding a candidate concept set whose terms are then standardized and supplemented with the help of a thesaurus, a tea dictionary, and domain experts. Third, association rule mining with support and confidence thresholds is used to identify relations between concepts. These steps yield the classes, properties, and instances of the tea science ontology, which are formalized in the ontology editor Protégé. This mainly involves determining the class hierarchy, setting the domains and ranges of object properties, and restricting data properties. An ontology evaluation and optimization step is also included: the HermiT reasoner bundled with Protégé checks logical consistency to support the soundness of the constructed ontology.

The third part applies the tea science ontology to knowledge retrieval. It first describes the limitations of traditional information retrieval (users find it hard to express their needs faithfully, matching is based on literal word forms, and vocabulary forms isolated islands) and the advantages of knowledge retrieval, namely semantic matching and intelligent reasoning. It then discusses how the key techniques of ontology-based tea knowledge retrieval are addressed, including query expansion, information resource indexing, and resource retrieval. Specifically, the Jena semantic package is used to read and parse the ontology, and the interface is written in the Eclipse development tool. The resulting retrieval system extends keyword-based retrieval with semantic expansion over synonyms, hypernyms, and related terms, which improves recall and precision to some extent.
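The TF-IDF concept extraction step in the second part can be illustrated with a minimal Python sketch. The abstract does not give the exact weighting formula, the threshold, or the corpus, so everything here is an assumption: the plain `tf * log(N / df)` variant, the function names `tfidf_scores` and `candidate_concepts`, the threshold value, and the toy English token lists that stand in for ICTCLAS output.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return {term: highest TF-IDF weight the term reaches in any document}."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    best = {}
    for doc in docs:
        for term, count in Counter(doc).items():
            tf = count / len(doc)               # relative frequency in this document
            idf = math.log(n_docs / df[term])   # plain IDF, no smoothing (assumed variant)
            best[term] = max(best.get(term, 0.0), tf * idf)
    return best


def candidate_concepts(docs, stopwords, threshold):
    """Drop stop words, then keep terms whose TF-IDF weight reaches the threshold."""
    filtered = [[t for t in doc if t not in stopwords] for doc in docs]
    scores = tfidf_scores(filtered)
    return sorted(term for term, weight in scores.items() if weight >= threshold)


# Hypothetical pre-segmented documents; a real run would use segmenter output.
TOY_DOCS = [
    ["tea", "the", "green_tea", "fixation", "green_tea"],
    ["tea", "the", "black_tea", "fermentation"],
    ["tea", "the", "tea_aphid", "pest"],
]

if __name__ == "__main__":
    print(candidate_concepts(TOY_DOCS, stopwords={"the"}, threshold=0.3))
```

In this toy corpus the word "tea" occurs in every document, so its IDF and weight are zero and it is filtered out. That is one reason a purely statistical candidate set is usually reviewed afterwards, as the abstract describes with the thesaurus, dictionary, and expert step.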
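The relation identification step in the second part relies on support and confidence thresholds. The sketch below shows how those two measures could be computed for pairs of concepts. It is not the thesis's implementation: the function name `mine_pair_rules`, the restriction to pairwise rules, the threshold values, and the toy transactions (sets of concepts that co-occur in one text unit) are all assumptions.

```python
from itertools import permutations


def mine_pair_rules(transactions, min_support, min_confidence):
    """Return rules (a, b, support, confidence) for ordered concept pairs a -> b.

    support(a -> b)    = share of transactions containing both a and b
    confidence(a -> b) = share of transactions containing a that also contain b
    """
    sets = [set(t) for t in transactions]
    n = len(sets)
    items = sorted(set().union(*sets))
    rules = []
    for a, b in permutations(items, 2):
        with_a = sum(1 for t in sets if a in t)
        with_both = sum(1 for t in sets if a in t and b in t)
        support = with_both / n
        confidence = with_both / with_a  # with_a >= 1 because a comes from the data
        if support >= min_support and confidence >= min_confidence:
            rules.append((a, b, support, confidence))
    return rules


# Hypothetical co-occurrence data: each set lists concepts found together.
TOY_TRANSACTIONS = [
    {"green_tea", "fixation"},
    {"green_tea", "fixation", "rolling"},
    {"black_tea", "fermentation"},
    {"green_tea", "rolling"},
]

if __name__ == "__main__":
    for a, b, sup, conf in mine_pair_rules(TOY_TRANSACTIONS, 0.5, 0.8):
        print(f"{a} -> {b}  support={sup:.2f}  confidence={conf:.2f}")
```

A rule that passes both thresholds only signals that two concepts tend to appear together. Deciding what kind of relation holds between them (for example, a processing step of a tea type) is assumed here to remain a separate, manual decision.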
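The semantic query expansion described in the third part can be sketched as follows. The thesis reads the ontology with Jena in a Java environment. This Python sketch swaps that for a plain dictionary so it stays self-contained, and the entries in `TOY_ONTOLOGY`, the function names `expand_query` and `search`, and the substring matching are illustrative assumptions rather than the system's actual design.

```python
# Hypothetical stand-in for the parsed ontology: each concept maps to the
# synonyms, hypernyms (broader terms), and related terms used for expansion.
TOY_ONTOLOGY = {
    "Longjing": {
        "synonyms": ["Dragon Well"],
        "hypernyms": ["green tea"],
        "related": ["West Lake"],
    },
    "green tea": {
        "synonyms": [],
        "hypernyms": ["tea"],
        "related": ["fixation"],
    },
}


def expand_query(keyword, ontology):
    """Return the keyword followed by its synonyms, hypernyms, and related terms."""
    entry = ontology.get(keyword, {})
    terms = [keyword]
    for kind in ("synonyms", "hypernyms", "related"):
        for term in entry.get(kind, []):
            if term not in terms:  # keep order, skip duplicates
                terms.append(term)
    return terms


def search(keyword, documents, ontology):
    """Return documents that mention the keyword or any of its expansion terms."""
    terms = [t.lower() for t in expand_query(keyword, ontology)]
    return [doc for doc in documents if any(t in doc.lower() for t in terms)]


if __name__ == "__main__":
    docs = ["Dragon Well is a pan-fired tea.", "Black tea is fully fermented."]
    print(expand_query("Longjing", TOY_ONTOLOGY))
    print(search("Longjing", docs, TOY_ONTOLOGY))
```

A literal keyword match for "Longjing" would miss the first toy document, while the expanded query finds it through the synonym. Expanding with hypernyms can also pull in broader, less relevant documents, so how far to expand is a trade-off between recall and precision.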
[Degree-granting institution]: Nanjing Agricultural University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP391.3

Article No.: 2201629

Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2201629.html