Research on a Personalized Search Algorithm Based on a User Interest Model
[Abstract]: With the rapid growth of information on the Internet, search engines were developed to help people find the information relevant to them, a major milestone in the development of resource querying. As user demands have grown, however, the shortcomings of traditional search engines, such as low retrieval accuracy and duplicate pages, have become increasingly apparent, and these engines can no longer meet users' needs. Personalization and intelligence have therefore become the trend in search engine development. This thesis studies search engine personalization in depth. The main contributions are as follows. First, based on a study of existing user interest models, a new algorithm for constructing a user interest model is proposed. Singular value decomposition (SVD) and the k-means clustering algorithm are used to cluster the user's browsing history and the words it contains at different levels, and two weighted interest trees are then built: a document class tree and a word class tree. The weight of each node in a tree represents the user's degree of interest in that class of documents or words. Experimental results show that the proposed user interest model markedly improves the accuracy of page interest classification. Second, an improved method is proposed to address the deficiencies of the vector space model: SVD is used to reduce the dimensionality of the vector space model. The resulting document-class matrix addresses the vector space model's problems of high dimensionality, sparsity, synonymy, and polysemy. Experimental results show that the improved vector space model classifies pages more accurately than the traditional vector space model. Finally, a new ranking algorithm is proposed to overcome the shortcomings of existing search engine ranking algorithms.
On the basis of the user interest model proposed in this thesis, a naive Bayes classifier is used to classify the documents retrieved by a traditional search engine and the words they contain, and the documents are then scored according to the classification results. Finally, the documents are sorted in descending order of score. Experimental results show that, under the same conditions, the proposed personalized ranking algorithm is more accurate than the probabilistic-model-based personalized search algorithm and better meets users' personalized needs.
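The ranking step described in the abstract (classify retrieved documents with a naive Bayes classifier, score them, sort in descending order) might look roughly like the sketch below. The abstract does not give the thesis's scoring formula, so this is an illustration under stated assumptions rather than the author's method. It assumes a flat set of interest classes in place of the weighted interest trees, a multinomial naive Bayes model with Laplace smoothing, and a score equal to the interest-weighted sum of class posteriors. All function names, class labels, weights, and sample data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_docs):
    """Collect per-class document and word counts from (tokens, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labeled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def class_posteriors(tokens, model):
    """Return P(class | tokens) under multinomial naive Bayes with Laplace smoothing."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        log_p = math.log(n_docs / total_docs)  # class prior
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in the browsing history
            log_p += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        log_scores[label] = log_p
    # Normalize in a numerically stable way (log-sum-exp).
    peak = max(log_scores.values())
    unnormalized = {c: math.exp(s - peak) for c, s in log_scores.items()}
    z = sum(unnormalized.values())
    return {c: v / z for c, v in unnormalized.items()}


def rerank(results, model, interest_weights):
    """Sort (doc_id, tokens) results by interest-weighted class posterior, descending."""
    scored = []
    for doc_id, tokens in results:
        posteriors = class_posteriors(tokens, model)
        score = sum(p * interest_weights.get(c, 0.0) for c, p in posteriors.items())
        scored.append((score, doc_id))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored]


# Hypothetical browsing history, already tokenized and assigned to interest classes.
history = [
    (["python", "code", "compiler"], "programming"),
    (["football", "goal", "league"], "sports"),
    (["code", "bug", "python"], "programming"),
]
# Hypothetical interest weights, standing in for the node weights of the interest trees.
interest_weights = {"programming": 0.8, "sports": 0.2}
# Hypothetical results returned by a traditional search engine.
results = [("d1", ["football", "league"]), ("d2", ["python", "bug"])]

model = train_naive_bayes(history)
ranked = rerank(results, model, interest_weights)
print(ranked)
```

With this sample data the programming-related document `d2` is ranked ahead of `d1`, because the hypothetical user weights programming more heavily than sports. In the thesis the class labels come from SVD and k-means clustering of the browsing history; here they are supplied by hand to keep the sketch short.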
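[Degree-granting institution]: Taiyuan University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP391.3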
Article ID: 2466907
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2466907.html