
Research and Implementation of a Question Answering System for Regional Information

Published: 2018-03-19 20:51

  Topic: information retrieval; Focus: knowledge base; Source: Southwest Jiaotong University, master's thesis, 2013; Type: degree thesis


[Abstract]: The Internet has become an important channel through which people obtain information, which has made search engine technology one of its most important technologies. Traditional search engines, however, cannot return an accurate answer to the user in a single step. Question answering systems, as a new form of information retrieval, can make up for many of these shortcomings and have therefore attracted growing attention. This thesis studies and designs a domain-specific question answering system, covering the construction of a structured knowledge base, the analysis and understanding of questions, and answer extraction techniques, and finally implements a prototype question answering system for regional information.

For knowledge base construction, a large amount of region-related information was downloaded from the Internet and organized, and information extraction techniques were used to build a structured knowledge base for regional information that supports simple retrieval of region-related information. A question-answer library that is extended automatically from user behavior was also designed; it further supports fast and accurate retrieval.

For question analysis and understanding, questions are analyzed through attribute tagging and pattern judgment. The thesis studies semantic similarity computation based on HowNet (《知网》) in depth and handles the problem that words not registered in HowNet cannot take part in the computation, which improves precision and recall in semantic retrieval over the structured knowledge base of basic regional information. Based on experimental comparison, a HowNet-based sentence similarity algorithm was chosen for retrieval from the question-answer library.

For answer extraction, answers are retrieved from the knowledge base by extracting attribute blocks from the question and using those blocks as the query. Because a local database is always limited, while the Internet, as a huge collection of information, is a usable data source, the thesis designs an Internet-based answer extraction module and proposes a web answer extraction algorithm based on the vector space model. The module takes the characteristics of search engines and web documents into account, and experiments show that it achieves relatively high accuracy.

Following the designed retrieval flow, a prototype system was implemented. It consists mainly of modules for question analysis, semantic similarity computation, knowledge base retrieval, question library management, and Internet retrieval, and it uses Google Maps to mark the geographic locations of retrieved results. For region-related information, the thesis implements the complete process from data collection and information structuring to semantic retrieval, and achieves its intended goals.
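The abstract does not spell out the vector-space answer extraction algorithm, so the following is only a minimal sketch of the general idea, not the thesis's method: candidate snippets (for example, search engine result summaries) are turned into TF-IDF vectors and ranked by cosine similarity to the question. The function names `tokenize`, `cosine`, and `rank_snippets`, the smoothed IDF formula, and the sample sentences are all assumptions made for illustration. The tokenizer splits on non-word characters, so Chinese text would first need a word segmenter.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-word characters.

    Stand-in for real word segmentation (hypothetical helper).
    """
    return [t for t in re.split(r"\W+", text.lower()) if t]


def cosine(a, b):
    """Cosine similarity between two sparse term-weight vectors (dicts)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_snippets(question, snippets):
    """Return (score, snippet) pairs sorted by similarity to the question."""
    docs = [tokenize(s) for s in snippets]
    n = len(docs)
    # Document frequency of each term over the candidate snippets.
    df = Counter(t for d in docs for t in set(d))

    def vectorize(tokens):
        tf = Counter(tokens)
        # Smoothed IDF (an assumed weighting): stays positive even for
        # terms that occur in every snippet or in none of them.
        return {
            t: c * (math.log((1 + n) / (1 + df.get(t, 0))) + 1.0)
            for t, c in tf.items()
        }

    q_vec = vectorize(tokenize(question))
    scored = [(cosine(q_vec, vectorize(d)), s) for d, s in zip(docs, snippets)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Illustrative usage with made-up snippets.
candidates = [
    "Chengdu is the capital of Sichuan province.",
    "The giant panda lives in bamboo forests.",
    "Sichuan cuisine is known for its spicy flavor.",
]
for score, snippet in rank_snippets("What is the capital of Sichuan?", candidates):
    print(f"{score:.3f}  {snippet}")
```

A real module of this kind would still have to pull the answer phrase out of the best-ranked snippet; the sketch stops at ranking.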
[Degree-granting institution]: Southwest Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3





Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1635945.html

