Research on a Method for Improving Query Translation Dictionaries Based on Parallel Corpora
Published: 2018-10-09 16:43
[Abstract]: Query translation for cross-language retrieval based on bilingual dictionaries suffers from inherent translation ambiguity, such as one-to-many mappings. Existing approaches have two main drawbacks: they cannot accurately translate non-compositional compound words, and combining bilingual dictionaries with other translation resources introduces substantial computational overhead. To build a practical bidirectional English-Chinese cross-language retrieval system, this paper starts from an existing bilingual science and technology dictionary containing a number of technical words and phrases, and focuses on how to introduce parallel corpora to improve it. The aim is to generate a probabilistic bilingual science and technology dictionary based on a sentence-aligned parallel corpus, providing real-time support for query translation disambiguation in cross-language retrieval.
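The abstract does not describe the estimation procedure itself, so the following is only a minimal sketch of one plausible way to attach probabilities to dictionary entries using a sentence-aligned corpus. It is not the authors' method. The function name `build_probability_dictionary`, the sample data, the substring matching, and the uniform fallback are all assumptions made for illustration.

```python
from collections import defaultdict


def build_probability_dictionary(dictionary, sentence_pairs):
    """Estimate P(translation | term) from co-occurrence in aligned sentence pairs.

    dictionary: maps a source term to its list of candidate translations.
    sentence_pairs: list of (source_sentence, target_sentence) strings.
    Matching is plain substring search, a simplification chosen so that
    unsegmented Chinese text can be handled without a tokenizer.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for source_sentence, target_sentence in sentence_pairs:
        for term, candidates in dictionary.items():
            if term not in source_sentence:
                continue
            for candidate in candidates:
                if candidate in target_sentence:
                    counts[term][candidate] += 1

    probabilities = {}
    for term, candidates in dictionary.items():
        total = sum(counts[term][c] for c in candidates)
        if total == 0:
            # No corpus evidence: fall back to a uniform distribution (assumption).
            probabilities[term] = {c: 1.0 / len(candidates) for c in candidates}
        else:
            probabilities[term] = {c: counts[term][c] / total for c in candidates}
    return probabilities


# Hypothetical toy data, not taken from the paper.
example_dictionary = {"bank": ["银行", "河岸"]}
example_pairs = [
    ("the bank approved the loan", "银行批准了贷款"),
    ("the central bank raised rates", "中央银行提高了利率"),
    ("we walked along the river bank", "我们沿着河岸散步"),
]
probabilities = build_probability_dictionary(example_dictionary, example_pairs)
```

On this toy corpus, "银行" receives about two thirds of the probability mass for "bank" and "河岸" about one third. Because the table is computed offline, query-time disambiguation reduces to a dictionary lookup, which fits the real-time goal stated in the abstract.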
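[Author affiliation]: Institute of Scientific and Technical Information of China
[Funding]: One of the research outputs of the China Postdoctoral Science Foundation project "Research on Query Translation Disambiguation Techniques for Cross-Language Retrieval Based on Query Classification" (project no. 20090450465) and the Institute of Scientific and Technical Information of China 2010 discipline construction project "Natural Language Processing" (project no. XK2010-6)
[Classification number]: G354; H087
Article ID: 2260043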
Article link: https://www.wllwen.com/wenyilunwen/hanyulw/2260043.html