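Research and Design of a Web Page Ranking Algorithm Based on Session Search
Published: 2018-03-07 22:25
Topic: session search. Focus: web page retrieval. Source: Nanjing University, master's thesis, 2017. Document type: degree thesis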
[Abstract]: With the rapid development of Internet technology, the volume of resources on the Internet keeps growing. Search engines let users find the information they need within this vast body of resources. What determines a user's satisfaction with retrieval is the set of web pages the search engine returns, and the core technology that determines those results is the search engine's web page ranking algorithm. The mainstream web page ranking algorithms today are Google's PageRank and IBM's HITS. Both are designed mainly around the link relationships between web pages: if a page is linked to by many other pages, the search engine treats it as a high-quality page and ranks it relatively high. These algorithms do not consider the interaction between the user and the search engine, however, so there is considerable room to improve web page ranking, and ranking algorithms are a main focus of current search engine research.

This thesis first introduces the main web page ranking algorithms used in current search engines and the MDP model. It then proposes QCM, a web page ranking algorithm based on user session search. QCM uses the syntactic edit changes between adjacent queries, the relationship between query changes, and previously retrieved documents to enhance session search, and it models session search as a Markov decision process (MDP). The effectiveness of the algorithm is verified experimentally. Finally, a prototype information retrieval system is designed on top of the QCM ranking algorithm.

In short, to address the shortcomings of existing web page ranking algorithms, the thesis proposes a ranking algorithm that pays closer attention to the interaction between the user and the search engine: it tracks how the search terms change over the course of a session and models those changes with an MDP. The efficiency of the algorithm is analyzed experimentally, and the designed experiments indicate that QCM considerably improves ranking efficiency.
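The abstract's idea of ranking by the change between adjacent queries can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function names, the whitespace tokenization, and the numeric weights (1.0, 1.5, 0.5, -0.5) are all assumptions chosen for illustration, and the MDP formulation is left out entirely. The sketch splits two adjacent queries into kept, added, and removed terms, weights each term by whether the previously retrieved document already contained it, and re-ranks candidate documents by weighted term frequency.

```python
from collections import Counter


def query_change(prev_query, curr_query):
    """Split two adjacent queries into kept (theme), added, and removed terms."""
    prev_terms = prev_query.lower().split()
    curr_terms = curr_query.lower().split()
    theme = [t for t in curr_terms if t in prev_terms]
    added = [t for t in curr_terms if t not in prev_terms]
    removed = [t for t in prev_terms if t not in curr_terms]
    return theme, added, removed


def term_weights(prev_query, curr_query, prev_doc):
    """Assign a hypothetical weight to each term based on the query change.

    All numeric weights below are illustrative assumptions.
    """
    seen = set(prev_doc.lower().split())  # terms the user has already been shown
    theme, added, removed = query_change(prev_query, curr_query)
    weights = {}
    for t in theme:
        weights[t] = 1.0                        # kept terms stay relevant
    for t in added:
        weights[t] = 0.5 if t in seen else 1.5  # novel added terms count most
    for t in removed:
        weights[t] = -0.5 if t in seen else 0.0  # dropped terms are penalized
    return weights


def score(doc, weights):
    """Weighted term-frequency score of one document."""
    tf = Counter(doc.lower().split())
    return sum(w * tf[t] for t, w in weights.items())


def rerank(docs, prev_query, curr_query, prev_doc):
    """Order documents by descending query-change score."""
    weights = term_weights(prev_query, curr_query, prev_doc)
    return sorted(docs, key=lambda d: score(d, weights), reverse=True)
```

For example, when the query changes from "pagerank algorithm" to "pagerank convergence proof" and the user has already seen a document about the PageRank algorithm, a call to `rerank` places a document that mentions "convergence" and "proof" ahead of another overview of the algorithm.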
[Degree-granting institution]: Nanjing University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP393.092; TP391.3
Article ID: 1581243
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1581243.html