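Research on Airline Passenger Segmentation Methods Based on Ticket-Booking Behavior
Published: 2018-06-02 06:48
Topics: customer segmentation + airline passengers; Source: Jiangsu University of Science and Technology, 2015 master's thesis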
[Abstract]: In recent years, with the rapid growth of the domestic economy, the number of civil aviation passengers has increased sharply, and domestic civil aviation has entered a phase of rapid development. To cope with fierce competition in the civil aviation market, airlines urgently need to analyze the travel preferences of different passenger groups and formulate competitive strategies accordingly. To this end, this thesis takes the customer information recorded when airline passengers purchase tickets as its data source and uses cluster analysis to partition the customer base, then analyzes passengers' travel preferences on that basis. Unlike the purely numerical data handled by traditional clustering algorithms, the data analyzed here record the booking behavior of airline customers and are distinctive in two respects: first, the source data are of mixed type, containing both numerical and categorical attributes; second, the data are voluminous and stored in a distributed fashion across the individual airlines. The thesis therefore first improves an existing clustering algorithm so that it suits the mixed-type data of a single airline, analyzing that airline's passenger travel preferences from a local perspective. It then designs a distributed clustering algorithm that draws on the passenger information of different airlines at the same time, analyzing the travel preferences of civil aviation passengers from a global perspective. The research work consists of two main parts.
(1) Based on the information recorded during the booking process, passenger segmentation is cast as a clustering problem over mixed-type data, and the k-prototypes algorithm is used to partition airline passengers. Because some of the attributes describing ticket purchases are discrete-valued with many categories and ambiguous semantics, civil aviation domain knowledge is used to transform their representation, which reduces the number of categories and makes the knowledge implicit in the attribute data explicit. In addition, a quantitative model of passenger value is constructed to characterize passenger value, so that travel preferences can be analyzed on the basis of an effective segmentation.
(2) To handle large-scale distributed mixed data sets effectively, the k-prototypes algorithm is extended to run in parallel and combined with domain knowledge, yielding the Domain based Parallel K-prototypes (DPKP) algorithm. Each airline's passenger partitioning and data analysis are carried out at its own site, which improves the running efficiency of the algorithm while protecting the airlines' commercial privacy.
Experimental results show that the proposed clustering algorithm is well suited to partitioning airline passenger data: it improves both the accuracy of the clustering results and the time efficiency of clustering. Finally, using a passenger data set provided by a domestic airline, the thesis applies the proposed algorithm to build an airline passenger segmentation model, segments the passengers, analyzes the travel needs of the different passenger groups according to the segmentation results, and formulates corresponding marketing strategies, thereby offering strategic recommendations to airlines.
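The abstract names k-prototypes but does not give its formulas, so the following is only a minimal sketch of the standard k-prototypes scheme for mixed-type data, not the thesis's improved algorithm or DPKP. It assumes the usual dissimilarity (squared Euclidean distance on numeric attributes plus a weight gamma times the number of categorical mismatches) and prototypes formed from numeric means and categorical modes. The attribute layout (two scaled numeric values, cabin class, booking channel), the toy records, the initial prototypes, and gamma = 0.5 are all hypothetical.

```python
from collections import Counter


def mixed_dissimilarity(record, prototype, num_idx, cat_idx, gamma):
    """Squared Euclidean distance on numeric attributes plus
    gamma times the count of mismatching categorical attributes."""
    numeric = sum((record[i] - prototype[i]) ** 2 for i in num_idx)
    categorical = sum(record[i] != prototype[i] for i in cat_idx)
    return numeric + gamma * categorical


def update_prototype(members, num_idx, cat_idx, width):
    """Prototype = mean of each numeric attribute, mode of each categorical one."""
    proto = [None] * width
    for i in num_idx:
        proto[i] = sum(r[i] for r in members) / len(members)
    for i in cat_idx:
        proto[i] = Counter(r[i] for r in members).most_common(1)[0][0]
    return proto


def k_prototypes(records, init_prototypes, num_idx, cat_idx, gamma=1.0, max_iter=20):
    """Alternate assignment and prototype update until labels stop changing."""
    prototypes = [list(p) for p in init_prototypes]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(
                range(len(prototypes)),
                key=lambda k: mixed_dissimilarity(r, prototypes[k], num_idx, cat_idx, gamma),
            )
            for r in records
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for k in range(len(prototypes)):
            members = [r for r, lab in zip(records, labels) if lab == k]
            if members:  # an emptied cluster keeps its previous prototype
                prototypes[k] = update_prototype(members, num_idx, cat_idx, len(records[0]))
    return labels, prototypes


# Hypothetical records: (scaled trip frequency, scaled average fare, cabin, channel)
records = [
    (0.9, 0.8, "business", "agent"),
    (1.0, 0.9, "business", "web"),
    (0.8, 1.0, "business", "agent"),
    (0.1, 0.2, "economy", "web"),
    (0.2, 0.1, "economy", "web"),
    (0.0, 0.3, "economy", "agent"),
]
labels, prototypes = k_prototypes(
    records,
    init_prototypes=[records[0], records[3]],
    num_idx=[0, 1],
    cat_idx=[2, 3],
    gamma=0.5,
)
print(labels)
print(prototypes)
```

The weight gamma controls how much a categorical mismatch counts relative to numeric distance. Reducing the number of categories through domain knowledge, as the abstract describes, would change which values can mismatch in the second term.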
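[Degree-granting institution]: Jiangsu University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP311.13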
Article ID: 1967816
Article link: https://www.wllwen.com/guanlilunwen/yingxiaoguanlilunwen/1967816.html