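Research on User Interest and Intelligent Optimization in Web Search
Published: 2018-06-16 17:07
Topics: Web search + link analysis; Source: Central South University, 2012 doctoral dissertation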
[Abstract]: With the rapid development of information technology, the volume of information on the Internet is growing explosively. The World Wide Web has become a huge and complex information space, and people have moved from a shortage of information to information overload. Internet information is dispersed, unordered, and massive, so finding the required information quickly, effectively, and accurately in this vast pool of resources is a challenging research problem, and Web search has become one of the research hotspots in the Internet field. Traditional Web search algorithms focus on the link structure of the Web and the rank weights of Web pages but ignore users' interest behavior, which leaves some search results incomplete and lowers accuracy. In addition, iteratively computing the Hub and Authority values of every page makes Web search inefficient and prone to a degree of dispersion and generalization. To address these shortcomings, and building on a summary and analysis of related work in China and abroad, this dissertation combines users' interest behavior with relevant intelligent optimization algorithms. The main research content and innovative contributions are as follows:
(1) It surveys research results and methods concerning search engine architecture and workflow, the design ideas behind traditional Web search algorithms, and heuristic algorithm models, as a reference for researchers studying the basic theory of Web search algorithms.
(2) Based on an analysis of existing representations of user interest models and the characteristics of Web search, it designs a user interest model for Web search that combines user browsing behavior, user feedback behavior, keyword weights, and short-term and long-term interests. This model lays the foundation for the subsequent study of heuristic search algorithms in the Web environment.
(3) Combining the advantages of the genetic quantum algorithm and the clonal selection algorithm, it proposes a Clonal Genetic Quantum Search Algorithm (CGQSA). The design idea and framework of the algorithm are described in detail, and its convergence is analyzed using Markov chain theory. The computational complexity of the algorithm is also analyzed. Experimental results show that CGQSA has good stability and scalability and clearly outperforms other traditional Web search algorithms and heuristic algorithms.
(4) Combining the link weights of keywords with the link structure of Web pages, it designs a mathematical model for evaluating the average weight of a Web page. Each Web page is represented as an individual in the population, and its performance is evaluated by a fitness function.
(5) Building on the genetic algorithm and incorporating the idea of simulated annealing, it proposes a Genetic Simulated Annealing Search Algorithm (GSASA). The design idea and framework of the algorithm are described in detail, and its convergence is analyzed. GSASA combines the advantages of the genetic algorithm and simulated annealing and takes the practical application environment of Web search into account, which considerably improves running efficiency and solution quality. Simulation experiments gave satisfactory results, indicating that the method is feasible and effective.
The results obtained are general theoretical results in the theory of Web search algorithms, and they remain useful as guidance for designing and implementing such algorithms. More importantly, the analytical tools and methods introduced here are broadly applicable and of reference value to related theoretical research on Web search algorithms.
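The abstract criticizes traditional link analysis that iteratively computes a Hub value and an Authority value for every page. As a minimal sketch of that baseline, assuming a standard HITS-style update with L2 normalization, the iteration could look like the following. This is not the dissertation's own method; the function name `hits`, the `links` dictionary, and the `iterations` and `tol` parameters are hypothetical names chosen for illustration.

```python
import math


def hits(links, iterations=50, tol=1e-9):
    """Iteratively compute hub and authority scores (HITS-style sketch).

    `links` maps each page to the list of pages it links to.
    Returns two dicts: (hub_scores, authority_scores).
    """
    # Collect every page that appears as a source or a target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority of a page: sum of hub scores of pages linking to it.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]

        # Hub of a page: sum of authority scores of the pages it links to.
        new_hub = {
            p: sum(new_auth[dst] for dst in links.get(p, []))
            for p in pages
        }

        # L2-normalize both vectors so the scores stay bounded.
        a_norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        h_norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        new_auth = {p: v / a_norm for p, v in new_auth.items()}
        new_hub = {p: v / h_norm for p, v in new_hub.items()}

        # Stop early once neither score vector changes noticeably.
        delta = max(
            max(abs(new_auth[p] - auth[p]) for p in pages),
            max(abs(new_hub[p] - hub[p]) for p in pages),
        )
        hub, auth = new_hub, new_auth
        if delta < tol:
            break

    return hub, auth


if __name__ == "__main__":
    # Toy graph (made up for illustration): pages a and b both link to c.
    hub_scores, auth_scores = hits({"a": ["c"], "b": ["c"], "c": []})
    print(max(auth_scores, key=auth_scores.get))
```

In this toy graph, page c receives the highest authority score because both other pages link to it. Every iteration touches every link in the graph, which illustrates the cost the abstract attributes to this style of computation on large page sets.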
[Degree-granting institution]: Central South University
[Degree level]: Doctorate
[Year degree conferred]: 2012
[Classification number]: TP391.3
Article ID: 2027466
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2027466.html