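Research and Implementation of a Query Expansion Algorithm for a Text-Clustering-Based Search Engine
Published: 2018-04-24 19:34
Topics: search engine + text clustering; Source: Beijing Forestry University (北京林业大学), master's thesis, 2012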
[Abstract]: The emergence of the Internet has caused the volume of information to grow rapidly, and search engines give people an effective tool for locating information within it. However, information is growing faster than anyone anticipated, and in the face of this information explosion the query model of traditional general-purpose search engines can no longer meet users' needs. Organizing this vast and diverse body of information effectively and presenting it to users in a reasonable, efficient way is a major challenge for search engines. Techniques such as data mining, pattern recognition, the Semantic Web, ontologies, and query expansion have been widely applied in the search engine field to address these challenges. This thesis first introduces the development of search engines, the state of research in China and abroad, and the basic principles and existing problems of traditional full-text search engines. It then reviews the development and trends of query expansion, the focus of this research. Next, it describes in detail the proposed query expansion algorithm for a text-clustering-based search engine, covering the clustering algorithm selection strategy, the expansion term selection strategy, and the similarity calculation method. The algorithm incorporates several improvements drawn from the practical application of the text clustering search engine system implemented in this thesis, and offers a solution to the in-depth query problem of text-clustering-based search engines. Finally, the thesis presents the module design and database design of the prototype text clustering search engine system and verifies the effectiveness of the proposed query expansion algorithm through experiments.
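The abstract names three ingredients (a clustering step, an expansion term selection strategy, and a similarity calculation) but does not specify any of them. As a rough illustration only, the sketch below shows one generic form of cluster-based query expansion. It is not the thesis's algorithm: the TF-IDF weighting, the cosine similarity, the centroid-based term ranking, and every function name (`tf_idf_vectors`, `cosine`, `centroid`, `expand_query`) are assumptions made for this example, and the clusters are taken as already computed.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per tokenized document (assumed weighting)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Smoothed IDF keeps every weight strictly positive.
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Average a list of sparse vectors into one cluster centroid."""
    total = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def expand_query(query_terms, clusters, k=2):
    """Append the k heaviest centroid terms of the cluster closest to the query.

    `clusters` is a list of clusters; each cluster is a list of tokenized
    documents. Clustering itself is assumed to have been done elsewhere.
    """
    all_docs = [doc for cluster in clusters for doc in cluster]
    vectors = tf_idf_vectors(all_docs)

    # Regroup the flat vector list into one centroid per cluster.
    centroids, start = [], 0
    for cluster in clusters:
        centroids.append(centroid(vectors[start:start + len(cluster)]))
        start += len(cluster)

    query_vec = {term: 1.0 for term in query_terms}
    best = max(centroids, key=lambda c: cosine(query_vec, c))

    # Rank candidates by centroid weight; break ties alphabetically.
    candidates = sorted(
        (t for t in best if t not in query_terms),
        key=lambda t: (-best[t], t),
    )
    return list(query_terms) + candidates[:k]


if __name__ == "__main__":
    sample_clusters = [
        [["apple", "fruit", "juice"], ["apple", "fruit", "orchard"]],
        [["apple", "iphone", "mac"], ["apple", "iphone", "store"]],
    ]
    print(expand_query(["apple", "iphone"], sample_clusters, k=1))
```

Choosing expansion terms from the single closest cluster is only one possible design. A real system would also need word segmentation for Chinese text and a stop-word list, neither of which is shown here.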
[Degree-granting institution]: Beijing Forestry University
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP391.3
Article ID: 1797934
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1797934.html