
Research and Implementation of a Semantics-Based Method for Extracting Event Information from Text

Published: 2018-05-14 21:21

  Topics: event extraction + semantic processing; Source: Shanghai Jiao Tong University, master's thesis, 2012


[Abstract]: Event extraction and tracking is an important research direction in natural language processing. How to accurately and efficiently extract the event information of interest from large volumes of complex, unordered information has long been a key problem in event extraction research.

In general, event extraction means extracting the events that interest a user from unstructured documents and describing them in structured form, so that users can query them and carry out further tracking and analysis. Research on event extraction usually targets a single fixed domain or news text, which better matches users' expectations of event extraction. The form of event extraction is also fairly fixed and uniform: typical approaches either extract structured text through template matching or classify by analyzing text passages.

Against the research background of a semantic search engine built on temporal and spatial elements, this thesis proposes a semantics-based method for extracting event information from text. Its novelty lies in applying several kinds of semantic knowledge together with statistical methods and in emphasizing the role of temporal and spatial elements in locating events for tracking. The method extracts and merges information and ultimately produces a description of the events in the text.

The texts handled in this work vary in type and are complex in structure and writing style, so traditional methods do not achieve satisfactory results on them, yet such texts are very common in practical applications. The thesis has a clear goal, and its method is effective without being cumbersome. By combining semantic knowledge with statistical learning, it has a clear advantage in handling complex corpora and large-scale data.

In addition, the thesis touches on concepts and algorithms from many areas of natural language processing. Through this work the author gained a deeper understanding of natural language processing research, and of information extraction in particular.
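The abstract mentions template matching, structured event descriptions, and the use of time and place to locate and merge events. The following is only a minimal sketch of that general idea, not the method developed in the thesis: the regular-expression template, the `Event` class, the function names, and the sample sentences are all assumptions made up for illustration.

```python
import re
from dataclasses import dataclass, field

# Hypothetical template: "On <date>, <description> in <Place>."
PATTERN = re.compile(
    r"On (?P<date>\d{4}-\d{2}-\d{2}), (?P<what>.+?) in (?P<place>[A-Z][a-zA-Z]+)\."
)


@dataclass
class Event:
    """Structured description of an event, keyed by its time and place."""
    date: str
    place: str
    descriptions: list = field(default_factory=list)


def extract_mentions(text):
    """Return (date, place, description) tuples for every template match."""
    return [(m["date"], m["place"], m["what"]) for m in PATTERN.finditer(text)]


def merge_by_time_and_place(mentions):
    """Group mentions that share the same (date, place) key into one Event."""
    events = {}
    for date, place, what in mentions:
        event = events.setdefault((date, place), Event(date, place))
        if what not in event.descriptions:
            event.descriptions.append(what)
    return list(events.values())


# Invented sample text, used only to exercise the sketch.
text = (
    "On 2012-03-01, a fire broke out in Shanghai. "
    "On 2012-03-01, firefighters contained the blaze in Shanghai. "
    "On 2012-03-02, a conference opened in Beijing."
)
events = merge_by_time_and_place(extract_mentions(text))
for e in events:
    print(e.date, e.place, e.descriptions)
```

In this toy version the two Shanghai sentences collapse into a single event because they share a date and a place. A fixed template like this breaks down on texts with varied structure and style, which is the situation the abstract says motivates combining semantic knowledge with statistical methods.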
[Degree-granting institution]: Shanghai Jiao Tong University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP391.1

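[Citing literature]

Related master's theses (1 entry)

1 Xing Xiaoran (幸小然); Design and Implementation of an Ontology-Based NFC Smart Application System for Cinemas [D]; University of Electronic Science and Technology of China; 2013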





