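Research on Web Search Ranking Models Oriented to User Preferences
Published: 2018-02-28 19:37
Keywords: information retrieval; user preference; PageRank algorithm; search engine. Source: Tianjin University of Technology, master's thesis, 2013. Document type: degree thesis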
[Abstract]: With the development and growing popularity of Internet technology, Web search provides people with abundant information and convenient services. To improve service quality, however, how to return results that better match the current user's request has become an active research question in computer application technology. If a Web information retrieval system can identify the current user's interest preferences and query intent, the results it retrieves will be highly relevant to the purpose of the query, and knowing the query topic also helps speed up the search. This thesis therefore studies the current user's interest preferences and query topics.
First, to address topic drift and conceptual ambiguity in traditional Web search systems, a ranking optimization method based on user feedback is proposed. It uses feedback information related to the user to refine the user's query keywords, which reduces the potential ambiguity of the query terms. The refined query then serves as the basis for the search.
Second, to address users' growing demand for specific services from Web search systems, a search ranking method based on user query intent is proposed. The method is modeled with a Markov chain and, based on stochastic complementation theory, combines user preferences and query intent with existing search ranking techniques, improving the query efficiency of the search system in order to meet users' query needs.
Finally, to address the problems and shortcomings of the theoretical study, a search system based on Heritrix is built on the Lucene development platform. Drawing on an analysis of key techniques in the page-crawling process, such as page relevance, concepts, and links, and combined with the ranking methods proposed in the thesis, a Web search system is constructed with the aim of providing higher-quality information retrieval. Simulation experiments are used to validate the approach and to inform the improvements needed in the performance of practical search systems.
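The abstract names PageRank and Markov chain modeling but does not give the thesis's equations, so the following is only a minimal sketch of how a user preference can be folded into a link-based ranking. It assumes the standard personalized PageRank formulation (a random surfer whose teleport step follows the user's preference vector), not the stochastic complementation method the thesis itself describes. The function name `personalized_pagerank`, the parameters `links` and `preference`, and the damping value 0.85 are all illustrative assumptions.

```python
def personalized_pagerank(links, preference, damping=0.85,
                          iterations=100, tol=1e-10):
    """Power iteration on a Markov chain over pages.

    links:      dict mapping a page to the list of pages it links to
                (hypothetical input format, not taken from the thesis)
    preference: dict mapping a page to a non-negative user-interest weight
    """
    # Collect every page that appears as a source or as a link target.
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})

    # Normalize the preference weights into a teleport distribution.
    # With no usable preference, fall back to the uniform distribution,
    # which gives ordinary (non-personalized) PageRank.
    total = sum(preference.get(p, 0.0) for p in pages)
    if total <= 0:
        teleport = {p: 1.0 / len(pages) for p in pages}
    else:
        teleport = {p: preference.get(p, 0.0) / total for p in pages}

    rank = dict(teleport)
    for _ in range(iterations):
        # Teleport step: jump to a page drawn from the user's preferences.
        new_rank = {p: (1.0 - damping) * teleport[p] for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Link-following step: split this page's rank over its links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page: redistribute its rank via the teleport vector.
                for p in pages:
                    new_rank[p] += damping * rank[page] * teleport[p]
        delta = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Toy link graph; page names are placeholders.
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    print(personalized_pagerank(graph, {}))          # uniform teleport
    print(personalized_pagerank(graph, {"b": 1.0}))  # user prefers page "b"
```

Biasing the teleport vector is one common way to combine a user profile with an existing link-analysis ranking: pages the user prefers, and pages reachable from them, receive more of the stationary probability mass, while the underlying link structure is left unchanged.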
[Degree-granting institution]: Tianjin University of Technology
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP391.3