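Design and Analysis of Hotel Search Recommendation
Published: 2018-06-19 01:38
Topics: recommender systems + hotel search; Source: Huazhong University of Science and Technology, 2013 master's thesis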
[Abstract]: With the development of information technology and the Internet, people have moved from an age of information scarcity into an age of information overload. Users find it hard to quickly pick out useful information from massive volumes of data, so the utilization of information has actually declined. The ability to filter information has therefore become an important measure of the quality of an information system. A good information system filters out, from massive volumes of information, what the user cares about most, which greatly improves the efficiency of the system and saves users time in finding information. Recommender systems emerged against this background; as a complement to traditional search engines, they play an important role in solving the information overload problem.

Taking a vertical travel search website as a case study, this thesis studies recommendation techniques for hotel search. After an in-depth analysis of commonly used recommender systems, and taking the characteristics of hotel search into account, a hotel recommender system based on hotel similarity is designed. The design idea is to infer a user's interests from the hotels the user has recently visited and then recommend similar hotels. The system consists of an offline module and an online module. The offline module computes a hotel similarity table from click logs and hotel information; the online module computes recommendations from the user's recent visit history and is responsible for collecting user feedback and recording system state. To evaluate and study the system offline, a recommendation evaluation system based on users' visit time series is also designed, and two accuracy metrics, hit rate and hit-rate precision, are defined as the main evaluation metrics. The evaluation system treats each user's detail-page click log as a visit sequence, and slides a time window consisting of the recent visit history, the currently visited hotel, and the target hotel over that sequence to replay the user's visits and the recommendation process, collecting statistics and computing the evaluation metrics. The evaluation system is used to study how various similarity algorithms, including content-based and collaborative filtering ones, affect the system, and to explore the factors that influence recommendation quality and ways to improve the system.

The study finds that the collaborative-filtering-based Amazon similarity algorithm and the detail-click conversion rate similarity algorithm perform best, that normalizing similarity is necessary, and that the hotel similarity table should be updated frequently. Using the optimal training set length, filtering out bad data, and combining multiple recommendation engines effectively improve the system. With all these improvements applied, hit rate increased by 7% and hit-rate precision by 15% relative to the original system.
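The offline similarity table, the online recommendation step, and the sliding-window replay described in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration, not the thesis's implementation: the function names, the toy click sequences, the co-visit cosine similarity (a simple stand-in for an item-to-item collaborative filtering measure), and the hit-rate definition (the share of replayed windows whose target hotel appears in the top-N list). Hit-rate precision is omitted because the abstract does not define it.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt


def build_similarity_table(sequences):
    """Offline step (assumed): co-visit cosine similarity between hotels."""
    visitors = defaultdict(set)  # hotel -> indices of users who clicked it
    for user, seq in enumerate(sequences):
        for hotel in seq:
            visitors[hotel].add(user)
    table = defaultdict(dict)
    for a, b in combinations(sorted(visitors), 2):
        common = len(visitors[a] & visitors[b])
        if common:
            score = common / sqrt(len(visitors[a]) * len(visitors[b]))
            table[a][b] = score
            table[b][a] = score
    return table


def recommend(table, recent, top_n):
    """Online step (assumed): rank hotels by summed similarity to recent visits."""
    scores = defaultdict(float)
    for hotel in recent:
        for other, sim in table.get(hotel, {}).items():
            if other not in recent:  # do not recommend already-visited hotels
                scores[other] += sim
    # Sort by descending score; break ties by hotel id for determinism.
    ranked = sorted(scores, key=lambda h: (-scores[h], h))
    return ranked[:top_n]


def replay_hit_rate(table, sequences, history_len=2, top_n=3):
    """Slide a window (recent history + current hotel -> target) over each sequence."""
    hits = trials = 0
    for seq in sequences:
        for i in range(len(seq) - 1):
            context = seq[max(0, i - history_len):i + 1]  # history plus current hotel
            target = seq[i + 1]                           # next hotel actually visited
            if target in context:
                continue  # assumption: revisits are not counted as trials
            trials += 1
            if target in recommend(table, context, top_n):
                hits += 1
    return hits / trials if trials else 0.0


# Hypothetical click sequences: one list of hotel ids per user, in visit order.
train_sequences = [["h1", "h2", "h3"], ["h1", "h2"], ["h2", "h3"], ["h4", "h5"]]
test_sequences = [["h1", "h2", "h3"], ["h4", "h1"]]

table = build_similarity_table(train_sequences)
hit_rate = replay_hit_rate(table, test_sequences, history_len=1, top_n=1)
print(f"hit rate: {hit_rate:.3f}")
```

Building the table from one set of sequences and replaying another mirrors the offline/online split; in a real evaluation the training logs would precede the replayed logs in time, which is where the training set length mentioned in the findings comes in.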
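[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year conferred]: 2013
[Classification number]: TP391.3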
Article ID: 2037758
Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/2037758.html