
Extension of the Sliced Inverse Regression Dimension-Reduction Model and Its Applications

Published: 2018-03-22 02:13

Topic: dimension-reduction models. Entry point: sliced inverse regression. Source: Guizhou University of Finance and Economics, 2014 master's thesis. Document type: degree thesis.


[Abstract]: With advances in computing, datasets containing large numbers of variables have become commonplace. For such multivariate data, a central problem in statistics is how to reduce the dimension of large-scale data while losing as little information as possible, so that useful structure can still be extracted. Sufficient dimension reduction is an important and effective tool for this task: its basic idea is to project the high-dimensional predictors onto a low-dimensional subspace while preserving, as far as possible, the information they carry about the response.

A key task in sufficient dimension reduction theory is to estimate a basis of the central dimension-reduction subspace and thereby determine its directions. An important estimator of this basis is sliced inverse regression (SIR), which identifies the central subspace through conditional first moments. SIR has a well-known limitation: when the regression function is symmetric, the conditional first moments vanish and SIR fails. This thesis extends the SIR dimension-reduction model to resolve this failure in the symmetric case.

Two improved dimension-reduction methods are proposed for symmetric regression functions. The first builds on SIR: the predictors are given a two-dimensional extension, the direction of the multivariate regression is estimated efficiently, and this direction is then used as a weight correction on the predictors, addressing the efficiency of the first-moment variance estimate; the weight correction is proved to identify the central dimension-reduction subspace. The second builds on the outer product of gradients (OPG) method: the predictors are again given a two-dimensional extension and the multivariate regression direction is estimated, yielding the gradient direction and thus resolving the failure of SIR under symmetry.

Simulations show that for ordinary simple regression functions the two proposed extensions perform as well as existing dimension-reduction methods, and that when the regression function is symmetric both extensions effectively reduce the predictor dimension. Finally, the two methods are applied to two real data sets and compared with several other dimension-reduction methods, confirming their effectiveness and suggesting a selection procedure for practice: when the predictors are strongly correlated, all of the methods perform well, and principal component analysis may be preferred for simplicity and interpretability of the factors; when the predictors are weakly correlated, SIR and the two proposed methods are candidates; and when SIR fails, the two proposed extensions can be applied.
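The failure mode the abstract describes can be seen in a few lines. Below is a minimal sketch of textbook SIR (not the thesis's extended method; the function name, bandwidth-free slicing scheme, and simulated model are illustrative assumptions): slice the sorted response, average the standardized predictors within each slice, and take leading eigenvectors of the weighted covariance of those slice means. For a monotone link the leading direction aligns with the true index; for the symmetric link y = (x'β)² the slice means vanish and SIR returns an essentially arbitrary direction.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Sliced inverse regression: eigenvectors of the covariance of
    slice-wise means of the standardized predictors."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    vals, vecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T   # Sigma^{-1/2}
    Z = Xc @ inv_sqrt                                  # standardized predictors
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)                        # slice mean of Z
        M += (len(idx) / n) * np.outer(m, m)           # weighted outer product
    w, v = np.linalg.eigh(M)
    B = inv_sqrt @ v[:, ::-1][:, :n_dirs]              # map back to X scale
    return B / np.linalg.norm(B, axis=0)

rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 5))
beta = np.array([1.0, 2.0, 0.0, 0.0, 0.0]) / np.sqrt(5.0)

# Monotone link: the leading SIR direction aligns with beta.
b_mono = sir_directions(X, X @ beta + 0.1 * rng.normal(size=2000))[:, 0]

# Symmetric link y = (x'beta)^2: conditional first moments vanish,
# so SIR fails -- the situation the thesis's extensions address.
b_symm = sir_directions(X, (X @ beta) ** 2 + 0.1 * rng.normal(size=2000))[:, 0]

print(abs(b_mono @ beta), abs(b_symm @ beta))
```

The absolute cosine |b'β| is close to 1 in the monotone case and markedly smaller in the symmetric case, which is why first-moment methods alone cannot handle symmetric regression functions.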
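The thesis's second extension builds on the outer product of gradients. As a point of reference, here is a sketch of plain OPG (again a textbook version, not the thesis's two-dimensional extension; the bandwidth and simulated model are illustrative assumptions): kernel-weighted local linear fits give a gradient estimate at each point, and the leading eigenvector of the averaged outer product of gradients estimates the index direction. Because the gradient of (x'β)² is 2(x'β)β, every local gradient lies along β, so OPG recovers the direction even where SIR fails.

```python
import numpy as np

def opg_direction(X, y, h=0.8):
    """Outer product of gradients: kernel-weighted local linear fits give
    per-point gradient estimates; the leading eigenvector of their average
    outer product estimates the dimension-reduction direction."""
    n, p = X.shape
    M = np.zeros((p, p))
    for i in range(n):
        d = X - X[i]
        # Square root of a Gaussian kernel weight, applied to the rows of
        # the design so that lstsq solves the weighted least-squares problem.
        sw = np.exp(-(d ** 2).sum(axis=1) / (4 * h ** 2))
        A = np.column_stack([np.ones(n), d])           # intercept + slopes
        coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
        g = coef[1:]                                   # local gradient at X[i]
        M += np.outer(g, g) / n
    vals, vecs = np.linalg.eigh(M)
    b = vecs[:, -1]                                    # leading eigenvector
    return b / np.linalg.norm(b)

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
beta = np.array([1.0, 2.0, 0.0, 0.0, 0.0]) / np.sqrt(5.0)
y = (X @ beta) ** 2 + 0.1 * rng.normal(size=500)  # symmetric link: SIR fails

b_opg = opg_direction(X, y)
print(abs(b_opg @ beta))
```

Because OPG uses gradient (second-order, in effect) information rather than conditional first moments, it is a natural base method to extend for symmetric regression functions, at the cost of n local regressions.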
【Degree-granting institution】: Guizhou University of Finance and Economics
【Degree level】: Master's
【Year conferred】: 2014
【CLC number】: C815





Article link: https://www.wllwen.com/shekelunwen/shgj/1646600.html

