Research on an Information Consulting Engine System for Financial Investors and Institutions

Published: 2018-03-26 19:14

  Topic: search engines. Focus: enterprise search engine framework. Source: Harbin Institute of Technology, master's thesis, 2017


[Abstract]: The information consulting engine system is a type of vertical search engine. Following a defined search strategy and implemented in a specific programming language, it integrates data on financial institutions, listed companies, and local government bonds from many countries, and presents the integrated results to a specific user group according to their search keywords. Its users are mainly institutional investors and individual investors, so providing personalized retrieval services is the key problem a practical system must solve. For this reason, the ranking algorithm based on a user personalization model and the improved page weight ranking algorithms studied here are of practical significance. The main work of this thesis is as follows. The search platform is first built on the enterprise search engine framework Solr, and personalized ranking techniques for search engines are studied. The behavioral features of users browsing web pages are then analyzed and extracted, and weights are computed for the keywords most relevant to each user. The resulting user feature vectors are used to build a user personalization model, and search results are re-ranked according to that model to achieve personalized ranking. Experiments show that this re-ranking algorithm better satisfies users' search needs but also lowers the retrieval efficiency of the search engine. To address this, the thesis studies two improved page weight algorithms that raise the efficiency of personalized ranking. The first is a page weight algorithm based on personalized page weight calculation, which mines user logs so that each page's weight reflects user-specific characteristics. The second is a personalized page weight algorithm based on transaction clustering patterns, which collects a user's keyword access sequence to obtain the set of keywords the user is interested in and uses that set to adjust page weights. Building on it, a personalized page weight algorithm based on topic-oriented transaction clustering patterns is proposed, which generalizes over the user's query keywords and page topics so that page weights carry the user's personal preferences. To verify the proposed algorithms, an information consulting engine system for financial investors and institutions was developed, and it has been deployed on a company's production business search platform. Experiments on a QA test platform show that the Solr-based search engine slightly outperforms the Endeca-based one; that results from the user personalization ranking algorithm match users' retrieval needs more closely; that the improved page weight algorithms are clearly more efficient in retrieval than the ranking algorithm based on the user personalization model; and that the topic-oriented transaction clustering algorithm is in turn clearly more efficient than the algorithms based on personalized page weight calculation and on transaction clustering patterns.
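The re-ranking step described in the abstract (keyword weights from browsing behavior, a user feature vector, then re-ordering of search results) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the abstract does not give its weighting formula, so the sketch assumes plain keyword counts, cosine similarity, and a linear blend controlled by a hypothetical parameter `alpha`. All function names and sample data are made up for illustration.

```python
import math
from collections import Counter

def build_user_profile(browsed_pages):
    """Sum keyword counts over the pages a user viewed and L2-normalize them.
    Assumption: raw counts stand in for the thesis's keyword weights."""
    weights = Counter()
    for keywords in browsed_pages:
        weights.update(keywords)
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {k: w / norm for k, w in weights.items()} if norm else {}

def cosine(profile, keywords):
    """Cosine similarity between the unit-length profile and a document's keyword counts."""
    doc = Counter(keywords)
    doc_norm = math.sqrt(sum(c * c for c in doc.values()))
    if not profile or doc_norm == 0:
        return 0.0
    dot = sum(profile.get(k, 0.0) * c for k, c in doc.items())
    return dot / doc_norm  # the profile already has norm 1

def rerank(results, profile, alpha=0.5):
    """Blend the engine's base score with profile similarity and sort descending.
    results: list of (doc_id, base_score, keywords); alpha is a hypothetical mixing weight."""
    scored = [
        (doc_id, (1 - alpha) * base + alpha * cosine(profile, kws))
        for doc_id, base, kws in results
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Hypothetical browsing history and search results.
history = [["bond", "yield"], ["bond", "municipal"]]
profile = build_user_profile(history)
results = [
    ("doc_equity", 0.60, ["stock", "dividend"]),
    ("doc_bond", 0.55, ["bond", "yield"]),
]
ranking = rerank(results, profile)
print([doc_id for doc_id, _ in ranking])
```

With this data the bond document moves ahead of the equity document even though its base score is lower. Because the similarity is computed per result at query time, the sketch also shows where the extra retrieval cost reported in the abstract could come from; the improved page weight algorithms are described as addressing that cost.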
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.3

Article ID: 1669218


Link to this article: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1669218.html

