Research on Web-Based Mining of OOV Term Translations in Cross-Language Information Retrieval
Published: 2018-11-26 13:03
[Abstract]: Query translation is one of the key factors affecting the performance of cross-language information retrieval (CLIR), and mining translations for out-of-vocabulary (OOV) terms in queries is important for improving that performance. Effective bilingual snippet resources (search-result summaries) are obtained automatically from search engines by expanding the query with translations of topic words. Multi-word candidate units are then extracted from these bilingual snippets using frequency-change information and adjacency information, and this extraction is compared with common statistics-based multi-word unit extraction methods. In the experiments, the translation mining method achieved a Top-1 inclusion rate of 62.02% and a Top-10 inclusion rate of 95.35%.
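[Author affiliations]: School of Computer Science, Soochow University; Jiangsu Provincial Engineering Technology Research and Development Center for Modern Enterprise Informatization Application Support Software
[Classification number]: TP391.3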
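The Top-1 and Top-10 inclusion rates quoted in the abstract can be read as the share of OOV terms for which a correct translation appears among the first N ranked candidates. The abstract does not give the exact definition, so the following is only a minimal sketch under that assumption. The function name `top_n_inclusion_rate` and the toy terms and candidates are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set


def top_n_inclusion_rate(
    ranked: Dict[str, List[str]],
    gold: Dict[str, Set[str]],
    n: int,
) -> float:
    """Share of OOV terms whose first n ranked candidates contain
    at least one reference translation (assumed definition)."""
    if not gold:
        return 0.0
    hits = 0
    for term, references in gold.items():
        # Terms with no mined candidates count as misses.
        candidates = ranked.get(term, [])[:n]
        if any(candidate in references for candidate in candidates):
            hits += 1
    return hits / len(gold)


# Hypothetical toy data: ranked candidate translations per OOV term.
ranked = {
    "term_a": ["x1", "x2", "x3"],
    "term_b": ["y1", "y2"],
    "term_c": ["z1"],
}
# Hypothetical reference translations for the same terms.
gold = {
    "term_a": {"x1"},
    "term_b": {"y2"},
    "term_c": {"w9"},
}

top1 = top_n_inclusion_rate(ranked, gold, 1)
top10 = top_n_inclusion_rate(ranked, gold, 10)
print(f"Top-1: {top1:.2%}, Top-10: {top10:.2%}")
```

In this toy setup only `term_a` has a reference translation ranked first, while `term_b` is recovered once the cutoff is widened, which is the same pattern as the gap between the reported Top-1 and Top-10 figures.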
Article ID: 2358637
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2358637.html