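Credit Evaluation of a Sample of A-Share Listed Companies Based on Manifold Learning
Published: 2018-01-04 20:43
Keywords: Credit Evaluation of a Sample of A-Share Listed Companies Based on Manifold Learning. Source: University of Electronic Science and Technology of China, master's thesis, 2014. Document type: degree thesis
Related topics: manifold learning; isometric mapping (ISOMAP); support vector machine; cluster analysis; credit evaluation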
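[Abstract (translated from Chinese)]: With the rapid development of science and technology and the rapid spread of economic globalization, effective credit risk assessment has become an important problem in finance. Accurate risk assessment is especially important in bank lending: even a small improvement in predicting the probability of default can bring a bank additional profit. Yet bank staff find it difficult to analyze and exploit the large customer databases that banks hold. Data mining offers strong support for finding regularities in a bank's existing business data and for building bank decision support systems. Because the data are large and high-dimensional, the raw data must be specially processed before input so that the data mining algorithms remain efficient and perform well. Manifold learning, a machine learning approach to dimensionality reduction, meets this need. This thesis therefore proposes a hybrid model based on manifold learning and data mining techniques for credit evaluation.

The proposed credit evaluation model works as follows: (1) Past nonlinear financial data of 250 sampled A-share listed companies are preprocessed with Z-score normalization. (2) Isometric mapping (ISOMAP), a typical manifold learning algorithm, reduces the dimensionality of the financial data, that is, performs feature extraction. (3) The extracted features are fed to an SVM to classify and predict corporate credit risk. To demonstrate the effectiveness of the proposed model, the performance of "PCA+SVM", "LLE+SVM", and "SVM" is compared with that of the proposed hybrid model "ISOMAP+SVM". (4) Clustering is performed on top of the classification to obtain a specific grouping of the listed companies and to assign credit grades, helping banks formulate corresponding loan strategies.

Combining qualitative and quantitative analysis, and processing the financial data with Matlab R2012a, the thesis reaches the following main conclusions: (1) Results obtained with Z-score normalization are clearly better than those obtained without it; whether the data are normalized strongly affects subsequent processing. (2) Compared with "PCA+SVM" and "LLE+SVM", the proposed ISOMAP-based credit evaluation model has the best classification accuracy and the lowest Type II error rate, and combining it with cluster analysis improves classification accuracy. The model achieves improved prediction accuracy and raises the accuracy of credit classification for listed companies. (3) After dimensionality reduction, and on the basis of the binary classification, the k-means algorithm grouped the 250 listed companies into 7 clusters, which helps in evaluating the credit risk of listed companies, assigning credit grades, and setting credit strategies. (4) Reducing the dimensionality of nonlinear data with either manifold learning or PCA improves prediction and clustering accuracy and lowers the cost of credit classification, but ISOMAP and LLE perform slightly better than PCA on nonlinear data.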
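Steps (1) and (2) of the model, Z-score normalization followed by ISOMAP, can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the thesis used Matlab R2012a, whereas this sketch assumes Python with NumPy; the function names `zscore` and `isomap`, the neighbor count, and the synthetic two-column data set are all hypothetical. The SVM stage is omitted.

```python
import numpy as np

def zscore(X):
    """Column-wise Z-score: subtract the mean, divide by the standard deviation."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma = np.where(sigma == 0, 1.0, sigma)  # avoid dividing by zero on constant columns
    return (X - mu) / sigma

def isomap(X, n_neighbors=5, n_components=2):
    """Textbook ISOMAP: kNN graph -> shortest-path distances -> classical MDS."""
    n = X.shape[0]
    # Pairwise Euclidean distances between all rows.
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    # Symmetric k-nearest-neighbor graph; missing edges have infinite length.
    G = np.full((n, n), np.inf)
    np.fill_diagonal(G, 0.0)
    nearest = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]  # column 0 is the point itself
    for i in range(n):
        G[i, nearest[i]] = D[i, nearest[i]]
        G[nearest[i], i] = D[i, nearest[i]]
    # Floyd-Warshall: approximate geodesic distances by shortest paths in the graph.
    for k in range(n):
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    if np.isinf(G).any():
        raise ValueError("neighborhood graph is disconnected; increase n_neighbors")
    # Classical MDS on the geodesic distances (double centering + eigendecomposition).
    J = np.eye(n) - 1.0 / n
    B = -0.5 * J @ (G ** 2) @ J
    vals, vecs = np.linalg.eigh(B)
    top = np.argsort(vals)[::-1][:n_components]
    return vecs[:, top] * np.sqrt(np.maximum(vals[top], 0.0))

# Synthetic stand-in for financial indicators (assumption): points on a curved
# one-dimensional manifold whose two columns differ in scale by a factor of 100000.
t = np.linspace(0.0, np.pi, 30)
X = np.column_stack([1000.0 * np.cos(t), 0.01 * np.sin(t)])

Z = zscore(X)                               # step (1): normalization
Y = isomap(Z, n_neighbors=4, n_components=1)  # step (2): feature extraction
print(Y.shape)
```

In this toy case the single ISOMAP coordinate tracks position along the curve, which a linear projection such as PCA cannot do exactly. The resulting `Y` would then be the input to a classifier in step (3).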
[Abstract]: With the rapid development of science and technology and the rapid spread of economic globalization, how to carry out effective credit risk assessment is an important issue in finance. Accurate risk assessment matters most in bank lending, where even a small improvement in predicting the probability of default can allow banks to earn more profit. However, bank staff find it difficult to analyze and use the large customer databases that banks maintain. Data mining technology can uncover the patterns in a bank's existing business data and strongly supports the development of bank decision support systems. Faced with massive, high-dimensional data, the original data need special processing before input to keep data mining efficient and to ensure good algorithm performance. As a machine learning method for dimensionality reduction, manifold learning meets this need. This paper therefore proposes a hybrid model based on manifold learning and data mining to study credit evaluation. The model is as follows: (1) perform Z-score normalization on the past nonlinear financial data of 250 sampled A-share listed companies; (2) use isometric mapping (ISOMAP), a typical manifold learning algorithm, to reduce the dimension of the financial data, that is, to extract features; (3) feed the extracted features into an SVM to classify and predict corporate credit risk, and, to prove the validity of the proposed model, compare the performance of "PCA+SVM", "LLE+SVM", and "SVM" with that of the proposed hybrid model "ISOMAP+SVM"; (4) cluster on the basis of the classification to obtain a specific grouping of the listed companies and assign credit ratings, helping banks formulate corresponding loan strategies. This paper combines qualitative and quantitative analysis and uses Matlab R2012a to process the financial data. It reaches the following conclusions: (1) The results of data preprocessing with Z-score normalization are clearly better than those without normalization; whether the data are normalized has a great influence on subsequent processing. (2) Compared with "PCA+SVM" and "LLE+SVM", the proposed credit evaluation model based on ISOMAP has the best classification accuracy and the lowest Type II error rate, and combining it with cluster analysis improves classification accuracy. The model achieves improved prediction accuracy and improves the accuracy of credit classification of listed companies. (3) After dimensionality reduction, and on the basis of the binary classification, the k-means algorithm clusters the 250 listed companies into 7 categories, which helps to evaluate the credit risk of listed enterprises, assign credit ratings, and set credit strategies. (4) Reducing the dimension of nonlinear data with either manifold learning or PCA improves the accuracy of prediction and clustering and reduces the cost of credit classification, but ISOMAP and LLE perform slightly better than PCA on nonlinear data.
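The clustering step can be illustrated with a small k-means sketch. It is an illustration under assumptions, not the thesis's procedure: the abstract gives only the algorithm name and the number of clusters (7), so the deterministic farthest-point initialization, the function names, and the toy two-dimensional points standing in for embedded companies are all hypothetical. The toy data has three obvious groups, so the example uses `k=3` rather than 7.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def farthest_point_init(points, k):
    """Deterministic seeding (an assumption): start from the first point, then
    repeatedly add the point farthest from all centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers

def kmeans(points, k, n_iter=100):
    """Lloyd's algorithm: alternate nearest-center assignment and mean update."""
    centers = farthest_point_init(points, k)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center in place
        if new_centers == centers:  # converged: no center moved
            break
        centers = new_centers
    return labels, centers

# Toy low-dimensional features (assumption): three well-separated groups of "companies".
embedded = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
    (10.0, 0.0), (10.1, 0.2), (9.9, -0.1),
]
labels, centers = kmeans(embedded, k=3)
print(labels)
```

Mapping each resulting cluster to a credit grade is a separate, domain-specific step that the abstract does not detail, so it is left out here.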
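[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2014
[Classification numbers]: F832.51; F275; F832.4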
Article ID: 1380050
Article link: https://www.wllwen.com/jingjilunwen/jinrongzhengquanlunwen/1380050.html