Research and Implementation of Hybrid Search Based on the Web Platform

Published: 2018-04-09 10:28

Topic: hybrid search　Focus: Struts2　Source: Xi'an University of Architecture and Technology, master's thesis, 2011

[Abstract]: With the rapid development of information technology, users increasingly expect web search to return information that is both fast and comprehensive, and delivering such results has become an active research topic. At present, the proportion of results returned by the major search engines is below 40%, and because the engines differ in mechanism, algorithm, and coverage, the overlap between the results that different engines return for the same query is below 34%. To obtain results that are comprehensive, accurate, and fast, a user therefore has to query several search engines in turn, which lowers search efficiency. The MySearch project removes this repetition by integrating web search and shopping search. It is an integrated application built on the web platform that provides two functions: web search, which aggregates Google, Baidu, Soso, and Youdao, and shopping search, which aggregates Taobao and Youdao. Specifically, the system constructs the source site's URL from the search conditions, fetches the source site's pages and extracts the data items, and then deduplicates and caches the data and displays it to the user, so that users can quickly look up information from several sites on a single page. The project uses Struts2 as the web framework, iBATIS as the data access layer component, and Ehcache as the distributed caching framework to achieve fast search. The thesis first introduces the research background of hybrid search, its state of development in China and abroad, and the project's origin and research significance. It then briefly presents the concept and advantages of hybrid search and the related technologies, with emphasis on how these technologies are applied in the project. Next, it describes the overall design of hybrid search in detail, including requirements analysis, database design, the overall project structure, and the application of the MVC design pattern. Finally, it covers the development process and implementation details, together with the testing methods and results. In the finished project, a user who enters keywords in the web search box obtains deduplicated, comprehensive, fast, and accurate information from the mainstream search engines Google, Baidu, Soso, and Youdao, presented in a single unified format. Shopping search likewise delivers its planned functionality. The project still has shortcomings; future development and maintenance will continue to improve the usability of the pages, implement search by popular category, and add management of the advertising module.
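The pipeline summarized in the abstract (build the source site's URL from the search conditions, collect the extracted items, then deduplicate them) might look roughly like the following Java sketch. It is only an illustration under stated assumptions, not the thesis's code: the class `HybridSearchSketch`, the `Result` fields, the `engine-a.example` host, the query parameter name, and the URL normalization rule are all hypothetical. Page fetching, data extraction, and the cache layer are left out.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class HybridSearchSketch {

    /** One extracted result item; the fields are illustrative assumptions. */
    static final class Result {
        final String title;
        final String url;
        final String engine;

        Result(String title, String url, String engine) {
            this.title = title;
            this.url = url;
            this.engine = engine;
        }
    }

    /** Builds a source engine's query URL from the user's keyword. */
    static String buildQueryUrl(String baseUrl, String queryParam, String keyword) {
        try {
            return baseUrl + "?" + queryParam + "=" + URLEncoder.encode(keyword, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is guaranteed to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** Assumed normalization: ignore case, scheme, a leading "www." and a trailing slash. */
    static String normalize(String url) {
        String u = url.trim().toLowerCase(Locale.ROOT);
        u = u.replaceFirst("^https?://", "");
        u = u.replaceFirst("^www\\.", "");
        if (u.endsWith("/")) {
            u = u.substring(0, u.length() - 1);
        }
        return u;
    }

    /** Merges per-engine result lists, keeping the first occurrence of each normalized URL. */
    static List<Result> mergeAndDeduplicate(List<List<Result>> perEngine) {
        Map<String, Result> seen = new LinkedHashMap<>();
        for (List<Result> results : perEngine) {
            for (Result r : results) {
                seen.putIfAbsent(normalize(r.url), r);
            }
        }
        return new ArrayList<>(seen.values());
    }

    public static void main(String[] args) {
        System.out.println(buildQueryUrl("https://engine-a.example/search", "q", "hybrid search"));

        List<Result> fromA = Arrays.asList(
                new Result("Struts", "http://www.example.org/struts/", "engine-a"));
        List<Result> fromB = Arrays.asList(
                new Result("Struts (duplicate)", "https://example.org/struts", "engine-b"),
                new Result("Cache", "https://example.org/cache", "engine-b"));

        for (Result r : mergeAndDeduplicate(Arrays.asList(fromA, fromB))) {
            System.out.println(r.engine + ": " + r.title + " -> " + r.url);
        }
    }
}
```

Keeping the first occurrence in a `LinkedHashMap` preserves the order in which the engines were queried. A real system would probably need a more careful notion of "same result" than this simple URL normalization.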
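[Degree-granting institution]: Xi'an University of Architecture and Technology
[Degree level]: Master's
[Year degree granted]: 2011
[Classification number]: TP311.52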

[References]