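Analysis and Prediction of the Path to Success in Bank Telemarketing
Published: 2018-06-20 00:53
Topic: bank deposits + telephone sales; Source: Central China Normal University, master's thesis, 2017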
[Abstract]: With today's well-developed telecommunications industry, telemarketing has become ubiquitous, yet public acceptance of it keeps falling, and campaigns often leave marketing staff exhausted. The findings of this thesis have theoretical and practical value for commercial banks in managing customers, identifying valuable customers, and maintaining customer loyalty. With the rise of big data, data mining is being applied to precision marketing in more and more fields. This thesis proposes using data mining to predict the outcome of selling long-term bank deposits by telephone. It collects 41,188 records of bank telemarketing data from abroad and analyzes 150 feature variables related to bank customers, products, and socioeconomic attributes, which are then reduced to 21 variables through manual, semi-automated selection.
The resulting dataset is imbalanced: only 11.3% of the records are successful telephone sales. To clarify how the imbalance affects the models, missing values are preprocessed first, and the SMOTE algorithm proposed by Chawla is then used to generate a new, balanced dataset. Comparing models trained on the balanced and the imbalanced datasets shows that models trained on the imbalanced data lean toward predicting the majority class, so the balanced dataset is used for model training and evaluation.
Three classification models are considered: logistic regression, decision tree, and support vector machine (SVM). Classification performance is measured by precision and by the area under the ROC curve (AUC). Logistic regression and decision tree models are easy to interpret and also predict well on new data. The SVM is more complex by comparison, but it learns both linear and nonlinear problems well, and it is this complexity that often lets it give accurate predictions. After the parameters or structure of each model are fixed through training comparisons, the test set yields precision of 47.3%, 73.1%, and 52.6% for the three models respectively, with AUC values of 0.921, 0.985, and 0.938. In marketing, managers want to identify high-value customers and avoid wasting resources on low-value ones in order to improve return on investment, so more accurate predictions are preferred. Because the AUC values differ little, the C5.0 decision tree algorithm is selected for prediction on the basis of having the highest precision.
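The oversampling step and the two evaluation measures named in the abstract can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the thesis's implementation: the function names `smote`, `precision`, and `roc_auc`, the parameter `k`, and the toy data are all assumptions introduced here, and the SMOTE variant shown is the basic interpolation idea without any of the refinements a library version would include.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority points (hypothetical helper).

    Each new point lies on the segment between a randomly chosen minority
    point and one of its k nearest minority neighbours.
    Assumes `minority` holds at least two numeric tuples of equal length.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Nearest minority neighbours of `base` by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic


def precision(y_true, y_pred):
    """Share of predicted positives that are truly positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if tp + fp else 0.0


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive is scored above a
    random negative; ties count as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration (not from the thesis dataset).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
new_points = smote(minority, n_new=4, k=2)

y_true = [1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]
print(len(new_points), precision(y_true, y_pred), roc_auc(y_true, scores))
```

Precision depends on the chosen decision threshold (0.5 in the toy example), whereas AUC summarizes ranking quality across all thresholds, which is one reason two models can have similar AUC values but clearly different precision.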
[Degree-granting institution]: Central China Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: F274
Article ID: 2042140
Article link: https://www.wllwen.com/jingjilunwen/xmjj/2042140.html