Shrinkage Estimation with the SLS Method for Generalized Linear Models

Posted: 2018-10-31 19:20
[Abstract]: With the arrival of the big data era, high-dimensional data have emerged in large quantities. This challenges existing variable selection methods and has made variable selection a prominent topic in modern statistics. Generalized linear models are among the most widely used statistical models in practice, yet variable selection for them has received comparatively little study and application. This thesis therefore introduces a variable selection method with a combined penalty, the SLS method, and extends it to generalized linear models, applying it to the Logistic regression model. Monte Carlo simulations are run under three settings: (1) weakly correlated variables, comparing SLS with MCP; (2) high-dimensional data with highly correlated variables, comparing SLS with MCP; (3) variables that exhibit both collinearity and high correlation, comparing the selection performance of SLS with that of Lasso, Adaptive Lasso, Elastic-net, and Adaptive Elastic-net. The SLS estimates are computed with a coordinate descent algorithm (CCD), and the tuning parameters are chosen by 5-fold cross-validation. The results show that: (1) SLS selects variables effectively whether the variables are weakly or highly correlated, and it improves on MCP in both cases; (2) under multicollinearity and high correlation, all five methods (Lasso, Adaptive Lasso, Elastic-net, Adaptive Elastic-net, and SLS) remove the collinear variables from the model, while SLS also selects all of the highly correlated variables into the model and so outperforms the other four methods.
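The abstract names three computational ingredients: the MCP penalty, coordinate descent, and 5-fold cross-validation. The following is a minimal sketch of those pieces only, not the thesis's implementation. It assumes a single standardized predictor and a squared-error working loss; in a Logistic model the update would typically be applied to a local quadratic approximation of the log-likelihood, which is omitted here. All function names (`mcp_penalty`, `mcp_update`, `five_fold_indices`) and the parameter names `lam` and `gamma` are illustrative assumptions.

```python
import random


def mcp_penalty(beta, lam, gamma):
    """Value of the MCP penalty for one coefficient (assumes gamma > 1)."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    # Beyond gamma * lam the penalty is flat, so large effects are not shrunk.
    return 0.5 * gamma * lam * lam


def soft_threshold(z, lam):
    """Lasso-style soft-thresholding operator."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def mcp_update(z, lam, gamma):
    """Minimizer over b of 0.5 * (z - b)**2 + mcp_penalty(b, lam, gamma).

    This is the one-coordinate step that a coordinate descent loop would
    apply to each coefficient in turn (hypothetical helper, gamma > 1).
    """
    if abs(z) <= gamma * lam:
        return soft_threshold(z, lam) / (1.0 - 1.0 / gamma)
    return z


def five_fold_indices(n, k=5, seed=0):
    """Shuffle 0..n-1 and split into k disjoint validation folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]
```

In a cross-validation loop, each fold returned by `five_fold_indices` would serve once as the validation set while the model is fitted on the remaining folds, and the penalty parameters with the best average validation loss would be kept.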
[Degree-granting institution]: Jinan University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: O212





Article link: https://www.wllwen.com/kejilunwen/yysx/2303293.html

