
Research and Implementation of Airline Ticket Price Prediction Technology

Published: 2018-05-19 13:44

  Topic: airline ticket price prediction + KNN algorithm; Source: Civil Aviation University of China, 2013 master's thesis


[Abstract]: With the rapid growth of civil aviation, more and more passengers choose air travel for long-distance trips. As network technology has advanced and electronic ticketing has been fully rolled out, the major airlines now sell e-tickets through their own websites, and travelers can obtain fare information from the Internet quickly and conveniently. Faced with frequently changing fares, travelers want to know how prices move and when a ticket is cheapest to buy. Using ticket data for domestic routes, this thesis builds data mining models that give passengers a predicted fare for a given travel date together with purchase advice. It takes one domestic flight as its research object.

The main work is as follows. First, ticket data collection: the vertical-search crawler tool Heritrix fetches fare pages from the websites, and HTMLParser extracts the fare data online. Second, data analysis and preprocessing: the crawled data are preprocessed, converted to a uniform standardized format, and stored in a database, and the relationship between each ticket attribute and the price is analyzed. Third, prediction models: after a detailed study of the principles of KNN, Q-learning, and weighted moving average time series analysis, the Q-learning and time series algorithms are improved. KNN is used to train a purchase-decision classifier that gives the user purchase advice. The improved Q-learning algorithm builds a fare prediction model whose Q matrix is trained continually on historical data and which presents a predicted price to the user. The improved weighted moving average time series method builds a further prediction model that distinguishes two cases, less than one week and more than one week, and presents a predicted price according to the gap between the current time and the time being predicted. Fourth, an ensemble learning model based on the subjective Bayes algorithm: Bayesian inference fuses the outputs of the three prediction models into an integrated predicted fare and a final purchase recommendation.

Combining the data acquisition, price prediction, and ensemble techniques above, the thesis designs a prototype fare prediction system. Using 9,336 crawled ticket records for flight CA1304 from Shenzhen to Beijing, predictions are made with the KNN, Q-learning, time series, and subjective Bayes ensemble algorithms. In simulation experiments, the subjective Bayes ensemble algorithm saved money effectively and outperformed the other three algorithms.
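
As a rough illustration of the weighted moving average idea named in the abstract, the sketch below forecasts the next fare as a weighted mean of the most recent fares. It is the plain textbook form, not the thesis's improved variant, whose details the abstract does not give. The function name, the weights, and the sample fares are all hypothetical.

```python
from typing import Sequence


def weighted_moving_average(prices: Sequence[float], weights: Sequence[float]) -> float:
    """Forecast the next price as a weighted mean of the most recent prices.

    `weights` lines up with the last len(weights) prices, oldest first,
    so the final weight applies to the most recent observation.
    """
    if not weights or len(prices) < len(weights):
        raise ValueError("need at least as many prices as weights")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    window = prices[-len(weights):]
    return sum(p * w for p, w in zip(window, weights)) / total


# Hypothetical daily fares (CNY) for one flight, oldest first.
history = [1000.0, 1100.0, 1200.0, 1300.0]
# Assumed weights: the most recent day counts three times as much as the oldest.
forecast = weighted_moving_average(history, weights=[1, 2, 3])
print(forecast)
```

Giving larger weights to recent observations makes the forecast track recent fare movements more closely than a simple average would. A model that treats short and long horizons separately, as the abstract describes, could use a different weight vector for each case, though the abstract does not say how the thesis does this.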
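[Degree-granting institution]: Civil Aviation University of China
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: F562.5;TP311.13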




Article ID: 1910332


Article link: https://www.wllwen.com/kejilunwen/sousuoyinqinglunwen/1910332.html

