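Research on Personalized Search Engines and Their Key Technologies
Published: 2018-06-06 19:40
Topics: search engine + personalized search; Source: Jiangnan University, 2012 master's thesis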
[Abstract]: With the rapid growth of information on the Web, search engines have become the main tool for information retrieval. General-purpose search engines, however, return the same results for a given query regardless of who issues it, so they cannot reflect users' individual needs. In addition, queries tend to be short and ambiguous, which significantly reduces retrieval efficiency. If a search engine can automatically offer a list of related queries for the query a user enters, the user can refine the query and reach the information needed. Starting from the key technologies of personalized search engines, this thesis examines personalized query recommendation in detail: it recommends queries in response to a user's request, and it learns the query preferences of different users so that the recommendations are personalized.
To this end, the thesis studies technologies related to information retrieval and personalized search engines and carries out simulation experiments. The main contributions are as follows.
First, a personalized clustering algorithm based on concept extraction is proposed; it captures the preferences of different users in order to provide personalized query recommendations. Experimental results show that the method achieves high accuracy, offers users resources they are potentially interested in, and significantly improves the efficiency of the personalized retrieval system.
Second, an analysis of the URLs that users click shows that many of the tokens appearing in a URL are meaningful, especially for high-quality web pages. These tokens can therefore serve as a brief description of a page's main content or topic. The thesis analyzes and processes the URLs and proposes a personalized query suggestion method based on the TF-IQF model and graph clustering.
Finally, user queries are analyzed and represented in three ways: (1) clicked documents; (2) similar queries; (3) reverse queries. Each representation corresponds to a semantic similarity measure, and assigning a different weight to each measure yields a new semantic similarity measure for queries. A user-query semantic clustering method based on user preferences then clusters the queries, which achieves personalized query recommendation.
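The URL-based step in the abstract can be illustrated with a minimal sketch. The abstract does not define TF-IQF, so the code assumes a TF-IDF analogue in which each query plays the role of a document: a token's weight is its frequency among the URLs clicked for a query, multiplied by log(N / qf), where qf is the number of queries whose clicked URLs contain the token. The function names, the stop list, and the click-log layout are all hypothetical, and the graph clustering stage is not shown.

```python
import math
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical stop list: URL fragments assumed to carry no topical meaning.
URL_STOP_TOKENS = {"www", "com", "org", "net", "html", "htm", "php", "index"}


def url_tokens(url):
    """Split a clicked URL into lowercase alphabetic tokens."""
    parsed = urlparse(url)
    raw = re.split(r"[^a-z]+", (parsed.netloc + parsed.path).lower())
    return [t for t in raw if len(t) > 1 and t not in URL_STOP_TOKENS]


def tf_iqf(click_log):
    """Weight URL tokens per query.

    click_log maps each query string to the list of URLs clicked for it.
    Assumed formula (not taken from the thesis): weight = tf * log(N / qf).
    Returns {query: {token: weight}}.
    """
    per_query = {
        query: Counter(t for url in urls for t in url_tokens(url))
        for query, urls in click_log.items()
    }
    n_queries = len(per_query)
    # qf[t] = number of queries whose clicked URLs contain token t.
    qf = Counter(t for counts in per_query.values() for t in counts)

    weights = {}
    for query, counts in per_query.items():
        total = sum(counts.values())
        weights[query] = {
            t: (c / total) * math.log(n_queries / qf[t])
            for t, c in counts.items()
        }
    return weights
```

Under this assumed formula, a token shared by every query receives weight zero, so only tokens that distinguish one query's clicks from another's contribute to later clustering.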
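The weighted combination of the three query representations can be sketched as follows. The abstract states only that each representation has its own semantic similarity measure and that the measures are combined with different weights; the use of Jaccard similarity for all three, the default weights, and the dictionary layout are assumptions made for illustration, and "reverse queries" are treated here simply as a set of query strings.

```python
# Keys of the three representations named in the abstract (layout is hypothetical).
REPRESENTATIONS = ("clicked_docs", "similar_queries", "reverse_queries")


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as 0.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def combined_similarity(p, q, weights=(0.5, 0.3, 0.2)):
    """Weighted sum of per-representation similarities.

    p and q are dicts mapping each name in REPRESENTATIONS to a set.
    The default weights are placeholders, not values from the thesis.
    """
    if len(weights) != len(REPRESENTATIONS):
        raise ValueError("one weight per representation is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(
        w * jaccard(p[name], q[name])
        for w, name in zip(weights, REPRESENTATIONS)
    )
```

A pairwise similarity of this form could then feed any standard clustering routine; the preference-based user-query clustering described in the abstract is not reproduced here.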
[Degree-granting institution]: Jiangnan University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP391.3
Article ID: 1987880
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1987880.html