Research on the Imbalanced Classification Problem in Bank Credit Rating
[Abstract]: Credit rating is an important part of bank credit risk management. It is the method by which banks evaluate a customer's credit status, loan repayment ability, and future prospects, and a process of guiding business decisions by mining customer information. In the current big data era, banks can obtain more and more customer credit data, and determining a customer's credit grade by mining the information hidden in that data is the most important problem they face. In real bank credit data sets, customers with good credit usually far outnumber those with bad credit, so bank credit rating is essentially an imbalanced classification problem. In imbalanced classification, the small (minority) class is usually the focus of attention; in credit rating, for example, banks care most about customers with poor credit. Effectively distinguishing and identifying small-class samples is therefore the key to solving the imbalanced classification problem. Machine learning algorithms often fail to identify small-class samples effectively when the classes are imbalanced, which is why this problem is the focus of the present work. Imbalanced classification is currently studied mainly at two levels, the data level and the algorithm level. At the data level, resampling methods are used to balance the class distribution; random under-sampling, ROSE, and SMOTE are typical examples. At the algorithm level, ensemble learning algorithms are often used. To verify the effectiveness of resampling methods and ensemble learning algorithms on imbalanced classification, simulation experiments are carried out on four data sets with different imbalance rates taken from the UCI and KEEL databases.
The experimental results show that resampling methods and ensemble learning algorithms can effectively improve a classification model's recognition rate for small-class samples. ROSE is a method for synthesizing artificial data. Improving its weight coefficient and combining it with random under-sampling yields the RHS (Random Hybrid Sampling) method; using the classical AdaBoost algorithm as the ensemble learner on top of RHS yields the RHSBoost (Random Hybrid Sampling Boosting) algorithm. The basic idea of the algorithm is as follows: first, a balanced data set is obtained by random under-sampling; then additional artificial data are synthesized with the improved ROSE method, which is also used to change the weights of the small-class samples, thereby strengthening the classifier. In this thesis, experiments are conducted on a bank credit data set. With a decision tree as the base classifier, RHSBoost is compared with the RUSBoost algorithm, with resampling methods, and with ensemble learning algorithms, and the results demonstrate the feasibility and advantages of RHSBoost.
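The two sampling steps described in the abstract (random under-sampling followed by ROSE-style synthesis) can be illustrated with a minimal sketch. This is not the thesis's RHSBoost implementation: the function names `random_undersample` and `smoothed_bootstrap` are hypothetical, the fixed `bandwidth` noise scale is an assumption standing in for ROSE's kernel bandwidth, and the improved weight coefficient and the AdaBoost stage are not reproduced because the abstract does not specify them.

```python
import random


def random_undersample(X, y, rng):
    """Randomly drop rows so every class keeps as many rows as the smallest class."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sample without replacement; the smallest class is kept whole.
        for row in rng.sample(rows, n_min):
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def smoothed_bootstrap(rows, n_new, bandwidth, rng):
    """ROSE-style synthesis (assumed form): pick a real row, then add
    independent Gaussian noise with standard deviation `bandwidth`
    to each feature."""
    synthetic = []
    for _ in range(n_new):
        seed_row = rng.choice(rows)
        synthetic.append([v + rng.gauss(0.0, bandwidth) for v in seed_row])
    return synthetic


# Toy data: 16 majority-class rows (label 0) and 4 minority-class rows (label 1).
rng = random.Random(0)
X = [[float(i), float(i % 3)] for i in range(20)]
y = [0] * 16 + [1] * 4

X_bal, y_bal = random_undersample(X, y, rng)
minority = [row for row, label in zip(X_bal, y_bal) if label == 1]
extra = smoothed_bootstrap(minority, n_new=6, bandwidth=0.1, rng=rng)
```

In a boosting setting, a resampling step of this kind would be repeated before each weak learner is trained; how RHSBoost adjusts the sample weights at that point is described only in the full thesis.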
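[Degree-granting institution]: Guangdong University of Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP311.13;F830.4;TP181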
Article ID: 2168719
Article link: https://www.wllwen.com/jingjilunwen/huobiyinxinglunwen/2168719.html