Design and Implementation of a Personalized Search Engine Scheme Based on Recommendation Technology

Published: 2018-05-27 09:19

  Topics: search engine + data mining; Source: China University of Geosciences (Beijing), master's thesis, 2012


[Abstract]: With the explosive growth of information on the Internet, search engine users demand higher quality in information acquisition. To help users find what they need faster and more accurately, search engines must analyze user behavior data in depth, mine behavior patterns, and improve retrieval relevance. The research in this thesis originates from a project group in a core department of a company; the group is dedicated to mining user behavior data to improve the user search experience.
Using data mining techniques, this thesis extracts useful behavior patterns from massive user behavior data and, with the help of the full-text search engine Lucene, designs and implements personalized search. A comparison with a system that does not implement personalized search shows that the personalized results better satisfy user needs.
To achieve this goal, the thesis first analyzes the relevant theory of information retrieval in depth, gives a complete description of the modules of a search engine and their functions, and emphasizes the importance of search engine evaluation. It also describes in detail the basic theory of data mining and the basic working principles of the recommendation techniques built on it.
Second, the thesis examines the requirements of personalized search along three dimensions: query personalization, ranking personalization, and product personalization. It builds a model and an evaluation system for personalized search and briefly analyzes the potential risks of personalized search. On this basis, it proposes an overall plan for implementing personalized search.
Third, to show that user behavior data can be used for personalized search, the thesis starts from the underlying data, puts forward five basic hypotheses, and argues from a statistical perspective that user behavior data provide theoretical support for personalized search. To store the massive user behavior data, the thesis also designs a data warehouse system that supports the back-end recommendation system.
Finally, the thesis proposes three detailed schemes for implementing personalized search, with flowcharts, and gives a detailed architecture for the core recommendation system and the offline mining module. The first scheme modifies the relevance ranking algorithm to incorporate a personalization factor. The second scheme does not modify the core algorithm of the existing search engine; it only re-ranks the existing retrieval results in a personalized way. The third scheme rewrites the user's query according to the user's personalized needs and likewise does not modify the original ranking algorithm. Weighing cost and coupling with the existing system, the thesis discards the first scheme and, using the full-text search engine Lucene, integrates the second and third schemes to implement personalized search. A comparison of search results between a "personalized environment" and a "comparison environment" confirms that personalized search better satisfies user needs.
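The abstract does not give formulas or code for the second and third schemes, so the following is only a minimal Python sketch of what "re-rank existing results" and "rewrite the query" could look like. It is not the thesis's implementation and does not use Lucene. The `Hit` shape, the category-based `profile`, the blending weight `alpha`, and the threshold `min_weight` are all assumptions introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One result from the existing, unmodified engine (hypothetical shape)."""
    doc_id: str
    base_score: float  # relevance score from the existing ranker
    category: str      # assumed document attribute used for personalization


def rerank(hits, profile, alpha=0.3):
    """Scheme 2 sketch: reorder existing results; the core ranker is untouched.

    profile maps category -> preference weight in [0, 1], assumed to be
    produced by offline mining of behavior data. alpha is an assumed
    blending weight between base relevance and user preference.
    """
    def blended(hit):
        preference = profile.get(hit.category, 0.0)
        return (1 - alpha) * hit.base_score + alpha * preference

    # sorted() is stable, so ties keep the engine's original order.
    return sorted(hits, key=blended, reverse=True)


def rewrite_query(query, profile, min_weight=0.5):
    """Scheme 3 sketch: append the user's strongest preferred category
    to the query as an extra term, if that preference is strong enough."""
    if not profile:
        return query
    top_category, weight = max(profile.items(), key=lambda kv: kv[1])
    if weight < min_weight or top_category in query.split():
        return query
    return f"{query} {top_category}"


# Usage with made-up data.
hits = [Hit("a", 0.9, "sports"), Hit("b", 0.8, "tech"), Hit("c", 0.5, "tech")]
profile = {"tech": 1.0, "sports": 0.1}
print([h.doc_id for h in rerank(hits, profile)])  # ['b', 'a', 'c']
print(rewrite_query("apple", profile))            # apple tech
```

Both functions sit outside the engine: one post-processes its output and the other pre-processes its input, which is consistent with the low-coupling argument the abstract gives for preferring the second and third schemes over the first.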
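[Degree-granting institution]: China University of Geosciences (Beijing)
[Degree level]: Master's
[Year degree awarded]: 2012
[Classification number]: TP391.3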



