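Research on a Small Business Credit Scoring Model Based on Bayesian Methods
Published: 2018-05-19 13:01
Topics: reject inference + Bayesian Bound and Collapse method; Source: Central South University, master's thesis, 2012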
[Abstract]: Non-random sample selection arises frequently in small business credit scoring, because bank credit screening leaves the default behavior of rejected applicants unobserved. The resulting missing data and sample selection bias can bias the model's parameter estimates and substantially degrade its predictive power. Reducing sample bias is therefore one of the central problems in credit scoring research.

The literature shows that the usual remedy is reject inference, but most such techniques perform poorly. This thesis adopts the Bound and Collapse (BC) method, proposed by Sebastiani and Ramoni (2000) on the basis of Bayesian theory, and combines it with the bank credit screening process to construct a new reject inference method. The principle is that, whatever the missing-data mechanism, the parameter estimate for the missing data can be bounded within an interval by considering certain extreme distributions; the upper and lower limits of the interval are computed from the data in the complete set. When the missing-data mechanism is known, a non-response probability model turns the information in the interval into a single-value estimate. This is the second step of the BC method: collapsing the interval into a point estimate for the missing data. In this way, data that carry sample information are used to impute the missing data, yielding a complete sample on which the small business credit scoring evolution model can be built.

The thesis uses data from the 2003 U.S. small business finance survey as the sample to test and evaluate the model's predictive power and performance. First, a credit scoring model is built by logistic regression on the first subsample and applied to the second subsample to simulate bank credit screening, producing the selected sample that remains after screening. Next, a credit scoring evolution model is built on the biased sample to verify the hypothesis that its classification ability is weakened or even lost. Finally, external and internal information is used to estimate the missing values, and every missing value in the selected sample is imputed to form a complete sample.

The model using this reject inference technique is compared with a standard model that contains all the data and with a censored model that leaves the missing data untreated. To check robustness, two screening rates are used, representing different degrees of missing data and of sample selection bias, and the models are evaluated with three measures: the KS test, the Brier score, and the ROC curve. The results show that applying the Bayesian Bound and Collapse method to the small business credit scoring evolution model effectively improves its classification ability and is an effective way to address sample bias under a non-random missing-data mechanism.
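As an illustration of the two BC steps described in the abstract, the following minimal sketch handles the simplest possible case: a single binary outcome (default or no default) whose value is missing for rejected applicants. It is not the thesis's implementation. The function names, the uniform Beta(1, 1) prior, the example counts, and the assumed probability of default among rejected applicants are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Bounds:
    """Interval for the default probability implied by the extreme completions."""
    lower: float
    upper: float


def bound(n_default, n_good, n_missing, prior_default=1.0, prior_good=1.0):
    """Bound step: posterior-mean default rate under the two extreme completions.

    lower: every missing outcome is assumed to be a non-default.
    upper: every missing outcome is assumed to be a default.
    The prior pseudo-counts (assumed Beta(1, 1) by default) are illustrative.
    """
    total = n_default + n_good + n_missing + prior_default + prior_good
    lower = (n_default + prior_default) / total
    upper = (n_default + n_missing + prior_default) / total
    return Bounds(lower, upper)


def collapse(bounds, p_default_given_missing):
    """Collapse step: convex combination of the two bounds.

    p_default_given_missing plays the role of the non-response model:
    the assumed probability that an unobserved (rejected) case is a default.
    """
    if not 0.0 <= p_default_given_missing <= 1.0:
        raise ValueError("p_default_given_missing must lie in [0, 1]")
    w = p_default_given_missing
    return w * bounds.upper + (1.0 - w) * bounds.lower


if __name__ == "__main__":
    # Hypothetical counts: 30 defaults and 270 non-defaults among accepted
    # applicants, plus 100 rejected applicants with unobserved outcomes.
    b = bound(n_default=30, n_good=270, n_missing=100)
    estimate = collapse(b, p_default_given_missing=0.25)
    print(f"interval: [{b.lower:.4f}, {b.upper:.4f}], collapsed: {estimate:.4f}")
```

The interval from `bound` holds whatever the missing-data mechanism is, while the point estimate from `collapse` is only as good as the assumed non-response probability. This mirrors the distinction the abstract draws between the two steps.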
[Degree-granting institution]: Central South University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: F830.5; F224
Article ID: 1910198
Article link: https://www.wllwen.com/guanlilunwen/huobilw/1910198.html