Chinese Named Entity Recognition Based on Position-Sensitive Embedding
Published: 2018-09-10 06:54
[Abstract]: In Chinese named entity recognition based on conditional random fields, the features learned by existing representation learning methods suffer from semantic representation bias, which introduces noise into recognition. To address this problem, a Chinese named entity recognition method based on position-sensitive Embedding is proposed. The method incorporates context position information into an existing Embedding model, uses multi-scale clustering to extract Embedding features at different granularities, and identifies Chinese named entities with a conditional random field. Experiments show that the features learned by this method alleviate the semantic representation bias and further improve the performance of the existing system; compared with the traditional method, the F-score increases by 2.85%.
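[Affiliation]: School of Computer Science, Wuhan University
[Funding]: National Natural Science Foundation of China, Key Program (61133012); National Natural Science Foundation of China, General Program (61373108)
[Classification No.]: TP391.1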
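
The abstract mentions extracting Embedding features of different granularity through multi-scale clustering before passing them to a conditional random field. The sketch below illustrates one way that step could look; it is not the authors' implementation. The abstract does not name the clustering algorithm, the number of clusters, or the unit of tokenization, so the use of k-means, the scales `(2, 4)`, the toy vocabulary, the random vectors standing in for trained embeddings, and the helper names `kmeans` and `multi_scale_cluster_features` are all assumptions made for illustration. The position-sensitive training of the embeddings themselves is not shown.

```python
import numpy as np


def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means (an assumed choice); returns one cluster id per row."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = points[rng.choice(len(points), size=k, replace=False)].copy()
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((points[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return labels


def multi_scale_cluster_features(vocab, embeddings, scales=(2, 4)):
    """Map each vocabulary item to one discrete cluster feature per scale."""
    features = {token: [] for token in vocab}
    for k in scales:
        labels = kmeans(embeddings, k)
        for token, label in zip(vocab, labels):
            # Small k gives coarse groups, large k gives fine-grained ones.
            features[token].append(f"cluster_k{k}={label}")
    return features


# Toy data: random vectors stand in for trained embeddings (hypothetical).
rng = np.random.default_rng(42)
vocab = ["武", "汉", "大", "学", "北", "京", "的", "在"]
embeddings = rng.normal(size=(len(vocab), 16))
features = multi_scale_cluster_features(vocab, embeddings)
```

Each token ends up with one string feature per scale, for example a `cluster_k2=...` and a `cluster_k4=...` entry. Discrete features of this kind can be added to the per-token feature set of a CRF tagger alongside the usual lexical features.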
Article ID: 2233712
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2233712.html