Research on an Optimized Retrieval Method for Specific Semantic Data in Massive Data
Published: 2018-08-21 10:25
[Abstract]: Retrieving specific semantic data from massive data sets is an important direction in data mining. A certain amount of error exists between the ontology information of the data and the semantic information it is converted into, so the correlation between a specific data-semantic ontology and the information itself degrades further under the combined interference of this error and the sheer volume of data. Traditional retrieval methods generally use a mapping approach and retrieve information according to the correlation of word-frequency information; as that correlation weakens, retrieval performance declines. This paper proposes a retrieval algorithm for specific semantic data based on ontology-model word segmentation with Gaussian marginalization fusion. Targeting the classification relationships among elements within an ontology in a search engine, the algorithm treats term data as time slices and partitions it into blocks, then applies upper and lower envelope partitioning and Gaussian marginalization fusion to the retrieved word frequencies. This overcomes the effect of center-frequency variation on semantic mapping and linguistic-knowledge matching between ontologies, yielding an improved semantic retrieval algorithm for specific data under massive-data interference. Simulation results show that the improved algorithm overcomes cross-term interference among semantic word frequencies and improves the precision of semantic retrieval over database data.
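[Author affiliations]: College of Humanities and Communications, Shanghai Normal University; School of Computer, Electronics and Information, Guangxi University
[Funding]: Guangxi Scientific Research and Technology Development Program (Guikeneng 1140008-3B); Guangxi Natural Science Foundation (2014GXNSFBA118274)
[Classification number]: TP391.3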
Article ID: 2195408
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2195408.html