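Research on a Data-Mining-Based Early-Warning Model for Telecom Customer Churn
Keywords: customer churn; data mining; class balancing; feature selection; combined early-warning model. Source: Central China Normal University, 2017 master's thesis. Document type: degree thesis.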
[Abstract]: Customer churn management is an important part of customer relationship management and is receiving growing attention from enterprises. Churn early warning is an effective churn management method: an early-warning model predicts which customers are likely to churn, so that the enterprise can raise timely warnings and take retention measures, reducing avoidable churn and, to some extent, the resulting losses. Telecom operators serve very large customer bases, hold rich customer data, and have a strong need for churn early-warning management. Against this background, this thesis studies data-mining-based early-warning models for telecom customer churn, using data mining's ability to extract useful information from massive data to build models that warn of potential churn behavior.

After reviewing domestic and international research, the thesis surveys recent work on early-warning model construction and on the use of data mining algorithms in such models. It then introduces the concept of customer churn, the relevant data mining theory, and the techniques used to build early-warning models, which together form the theoretical basis of the study.

For data preparation, the thesis takes customer data from a telecom operator in one city as its empirical subject and discusses and applies four steps: removing useless features, filling missing values, discretizing the data, and balancing the imbalanced classes. These steps secure reasonably high data quality for modeling. For key feature selection, given the high dimensionality of telecom customer data, the thesis compares three common methods: the chi-square test, principal component analysis, and the Fisher ratio. The comparative experiments show that churn early-warning models built on different algorithms achieve different predictive performance under different feature selection methods. Overall, the Fisher ratio selects better feature subsets than the chi-square test or principal component analysis and yields better predictive performance for the models built on each of the algorithms.

For model construction, the thesis proposes a combined early-warning model for telecom customer churn. Unlike a typical combined model, it adds a Fisher-ratio feature selection step and optimizes the training set according to the best feature subset of each individual model. Three data mining algorithms, the C5.0 decision tree, the BP neural network, and the support vector machine (SVM), are used to build the base models. The combination weights that minimize the deviation between the combined warning and each individual warning are obtained by solving a Lagrangian function, and the predictions of the three base models are combined linearly with these weights to produce the combined model's prediction. The empirical results show that the combined model predicts better than each individual base model and can, to some extent, reduce the revenue loss of telecom operators.
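The abstract does not state the exact objective behind the Lagrangian weighting step, so the following is only a sketch under an assumed formulation: the classic minimum-squared-error linear combination, where the weights w minimize w^T E w subject to the weights summing to 1, and E holds the cross-products of each base model's errors against the true labels. Setting the gradient of the Lagrangian to zero gives w = E^{-1} 1 / (1^T E^{-1} 1). The function names `solve_linear`, `combination_weights`, and `combine` are hypothetical and chosen for illustration; the thesis may use a different deviation measure.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] as a copy so the inputs are not mutated.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("error matrix is singular; base models may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def combination_weights(errors):
    """Assumed objective: minimize w^T E w subject to sum(w) == 1.

    errors[t][k] is the residual (true label minus score) of base model k
    on sample t. Returns one weight per base model.
    """
    k = len(errors[0])
    # E[i][j] = sum over samples of e_i(t) * e_j(t).
    e = [[sum(row[i] * row[j] for row in errors) for j in range(k)]
         for i in range(k)]
    z = solve_linear(e, [1.0] * k)   # z = E^{-1} 1
    total = sum(z)                   # 1^T E^{-1} 1
    return [zi / total for zi in z]  # normalize so the weights sum to 1


def combine(scores, weights):
    """Weighted linear combination of the base models' churn scores per sample."""
    return [sum(w * s for w, s in zip(weights, row)) for row in scores]
```

In use, `errors` would be computed on a validation set from the three base models' churn scores (for example, decision tree, neural network, and SVM outputs), and `combine` would then be applied to new customers' scores. Under this assumed formulation the weights are not constrained to be non-negative, so a negative weight is possible when the base models' errors are strongly correlated.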
[Degree-granting institution]: Central China Normal University
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP311.13; F626