A Deep Learning-Based Entity Linking Method
[Abstract]: Text understanding plays an important role in natural language processing and artificial intelligence. Entity linking extracts entity mentions from a text and, after disambiguation, maps each mention to a unique entity in a specified knowledge base. It helps a computer identify the important semantic information in a sentence and distinguish the different meanings a word takes in different contexts, and is therefore indispensable for machine understanding of natural language. This thesis proposes entity linking methods from two angles: entity linking based on the attribute tables of encyclopedia websites, and entity linking based on plain text. Following the way humans understand text, from words to sentences, the thesis describes the basic natural language processing steps, including word segmentation, word vector representation, document vector representation, and synonym mapping, and constructs a basic Chinese knowledge base from Baidu Baike, Hudong Baike, and Chinese Wikipedia. First, a machine-learning-based ranking algorithm is proposed to disambiguate entities in attribute tables, with the aim of strengthening the connections among entities in the Chinese knowledge base. Then, a deep-neural-network-based algorithm for linking entities in text is proposed: entity mentions are extracted from the text and their candidate entity sets are retrieved; a bidirectional long short-term memory (LSTM) network models each entity mention together with its mention type; and a deep convolutional neural network models each candidate entity together with its entity type, so that the similarity between a mention and its candidate entities can be trained.
Finally, document vector representations supplement the global semantic representation of entity mentions and candidate entities, and disambiguation is combined with a graph-model algorithm. The proposed method was tested on Chinese and English datasets. It achieved an accuracy of 81.07% on random-link disambiguation over Chinese Baidu Baike, and the highest accuracy on four years of TAC-KBP competition datasets. On the GERBIL entity annotation platform it achieved a Macro F1 of 61.47, the highest when compared with six other algorithms. The thesis also demonstrates the algorithm applied in a real website system.
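The candidate-ranking idea in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the BiLSTM and CNN encoders are replaced by hand-written toy vectors, the graph-model step is left out, and the names `cosine`, `rank_candidates`, and the weight `alpha` are hypothetical. It shows only how a local mention-to-entity similarity and a global document-vector similarity might be combined to rank candidate entities.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def rank_candidates(mention_vec, mention_doc_vec, candidates, alpha=0.7):
    """Rank candidate entities for one mention, best first.

    candidates maps entity_id -> (entity_vec, entity_doc_vec).
    alpha is a hypothetical weight between the local and global scores.
    """
    scored = []
    for entity_id, (entity_vec, entity_doc_vec) in candidates.items():
        local_score = cosine(mention_vec, entity_vec)           # mention vs. entity
        global_score = cosine(mention_doc_vec, entity_doc_vec)  # document vs. document
        scored.append((alpha * local_score + (1 - alpha) * global_score, entity_id))
    return [entity_id for _, entity_id in sorted(scored, reverse=True)]


# Toy vectors standing in for learned representations (assumed values).
mention_vec = [1.0, 0.0]      # the mention "Apple" in a technology context
mention_doc_vec = [0.9, 0.1]  # vector of the document containing the mention
candidates = {
    "Apple_Inc": ([0.9, 0.1], [1.0, 0.0]),
    "Apple_fruit": ([0.1, 0.9], [0.0, 1.0]),
}
ranking = rank_candidates(mention_vec, mention_doc_vec, candidates)
print(ranking)
```

In this toy case the company entity ranks first because both its local and global similarities to the mention are higher. A learned model would produce the vectors itself, and could replace the fixed weighted sum with a trained scoring layer.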
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.1