Some Studies on Linear Models and Single-Index Models

Published: 2018-04-25 17:31

  Topics: linear model + single-index model; Source: Chongqing University, 2016 doctoral dissertation


[Abstract]: Robust estimation and variable selection are two very important aspects of statistical modeling. Variable selection means identifying the covariates that truly affect the response variable, thereby reducing model complexity and improving prediction accuracy. At the same time, we want the proposed estimation method to be robust, especially when the data contain many outliers, so that the variable selection results are not greatly affected. In addition, longitudinal data are widely used in biomedicine, economics, sociology, and other fields, and have become one of the active research topics in statistics. Based on linear models, generalized linear models, single-index models, and single-index coefficient models, this thesis studies robust estimation, variable selection, and longitudinal data analysis.

Chapter 2 considers linear models in which the number of parameters diverges with the sample size, and proposes a robust variable selection method based on the SCAD penalty function and rank regression. The method effectively overcomes the influence of outliers in the response variable or of heavy-tailed error distributions. Under some regularity conditions, the proposed estimator is shown to be consistent and to have the oracle property. Furthermore, to overcome the computational difficulties of existing methods, the chapter proposes a greedy coordinate descent algorithm that quickly computes the penalized rank regression estimator. To handle the case p ≫ n, the chapter proposes a two-step estimator based on the distance-correlation independence screening method and proves that the two-step estimator has the oracle property. Finally, numerical simulations verify the robustness and effectiveness of the proposed method.

Because the linear model considered in the previous chapter cannot handle discrete response variables, Chapter 3 studies robust estimation and variable selection for longitudinal generalized linear models. Specifically, we combine an exponential score function with a weight function to construct robust and efficient estimating equations, which can overcome the influence of outliers in both the response variable and the covariates. To avoid solving a convex optimization problem, the chapter constructs robust and efficient smooth-threshold generalized estimating equations that perform parameter estimation and variable selection simultaneously. Under some regularity conditions, the proposed estimator is shown to be consistent and to have the oracle property. Furthermore, the robustness of the proposed estimator is established through its influence function. Finally, numerical simulations and a real-data analysis verify the finite-sample properties of the proposed estimator.

Chapter 4 studies estimation for longitudinal single-index models. First, initial estimators of the index coefficient vector and the nonparametric link function are obtained by ignoring the within-subject correlation of the repeated measurements. Second, to avoid estimating the working correlation matrix in the generalized estimating equations, the chapter uses the modified Cholesky decomposition to decompose the covariance matrix into autoregressive coefficients and innovation variances, and then estimates them through regression modeling. Third, the profile weighted least squares method is used to construct more efficient two-step estimators of the index coefficient vector and the nonparametric link function. Under some regularity conditions, the consistency and asymptotic normality of the proposed estimators are proved. Finally, numerical simulations and a real-data analysis verify the superiority of the proposed method.

For single-index coefficient models, Chapter 5 combines local linear approximation with modal regression to propose a robust and efficient estimation method. Under some regularity conditions, the consistency and asymptotic normality of the proposed estimators are established. Furthermore, the theoretically optimal bandwidth is discussed and a method for choosing the bandwidth in practice is given, and it is shown theoretically that the proposed method does not lose estimation efficiency. Finally, numerical simulations verify the robustness and effectiveness of the proposed estimators.

Chapter 6 studies estimation for longitudinal single-index coefficient models. The estimation of the nonparametric link function in Chapter 5 involves an "undersmoothing" bandwidth, which makes bandwidth selection challenging in practice. The chapter therefore proposes centralized generalized estimating equations to overcome this problem. To improve the efficiency of statistical inference, the chapter uses the modified Cholesky decomposition to estimate the covariance matrix and then constructs more efficient centralized generalized estimating equations for the index coefficient vector. Weighted least squares is then used to obtain a more efficient estimator of the nonparametric link function. Under some regularity conditions, the large-sample properties of the proposed estimators are established. Finally, numerical simulations and a real-data analysis verify the effectiveness and practicality of the proposed method.
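
The abstract says that Chapters 4 and 6 decompose the covariance matrix of the repeated measurements into autoregressive coefficients and innovation variances via the modified Cholesky decomposition. The following is a minimal illustrative sketch of that decomposition in general, not the thesis's estimation procedure. The function names `modified_cholesky` and `ar1_covariance`, and the AR(1) test matrix, are assumptions introduced here for illustration only.

```python
import numpy as np


def modified_cholesky(sigma):
    """Return (T, d, phi) such that T @ sigma @ T.T == diag(d).

    T is unit lower triangular, d holds the innovation variances, and
    phi[j, k] (k < j) is the coefficient of y_k when y_j is regressed
    on its predecessors y_0, ..., y_{j-1}.
    """
    sigma = np.asarray(sigma, dtype=float)
    # Standard Cholesky factor: sigma = L @ L.T with L lower triangular.
    L = np.linalg.cholesky(sigma)
    d_sqrt = np.diag(L)
    # Scale column j by 1 / L[j, j] so that C has a unit diagonal
    # and sigma = C @ diag(d_sqrt**2) @ C.T.
    C = L / d_sqrt
    T = np.linalg.inv(C)
    d = d_sqrt ** 2
    # The below-diagonal entries of T are the negated autoregressive coefficients.
    phi = -np.tril(T, k=-1)
    return T, d, phi


def ar1_covariance(m, rho):
    """Hypothetical test input: AR(1) covariance with entries rho**|i - j|."""
    idx = np.arange(m)
    return rho ** np.abs(idx[:, None] - idx[None, :])


if __name__ == "__main__":
    sigma = ar1_covariance(4, 0.5)
    T, d, phi = modified_cholesky(sigma)
    print("innovation variances:", d)
    print("autoregressive coefficients:\n", phi)
```

For an AR(1) covariance the only nonzero autoregressive coefficient in each row is the lag-one coefficient, which makes the output easy to check by hand. In a longitudinal setting the coefficients and the log innovation variances are unconstrained, which is what makes it possible to model them by regression as the abstract describes.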

[Degree-granting institution]: Chongqing University
[Degree level]: Doctoral
[Year degree conferred]: 2016
[Classification number]: O212


Article ID: 1802287


Article link: https://www.wllwen.com/shoufeilunwen/jckxbs/1802287.html

