Generalized Poisson Hurdle Regression Model and Its Statistical Diagnostics
Published: 2018-08-02 19:14
[Abstract]: Count data are a common type of discrete data and arise in many fields of everyday life. The most basic model for count data is the Poisson regression model, but it requires the mean to equal the variance, a condition that is rarely met in practice. The generalized Poisson distribution is a natural extension of the standard Poisson distribution: it introduces a dispersion parameter that governs the relationship between the mean and the variance. Some count data also contain a large number of zeros, clearly more than a Poisson or generalized Poisson distribution would produce; such data are called zero-inflated. This thesis studies data that are both zero-inflated and have a mean that differs from the variance, and gives a detailed account of a typical model for such data, the generalized Poisson Hurdle regression model.
The thesis is organized as follows. Chapter 1 introduces the research background. Chapter 2 introduces the generalized Poisson regression model, the Hurdle regression model and the generalized Poisson Hurdle regression model, and gives a parameter estimation method for the generalized Poisson Hurdle regression model. Chapter 3 presents diagnostic statistics based on the case-deletion model, namely the one-step approximation of the parameter estimates, the generalized Cook distance and the likelihood distance, and tests for the presence of the dispersion parameter. Chapter 4 presents model selection criteria for judging which model fits better. Chapter 5 uses Monte Carlo simulation to illustrate the validity of the statistics introduced in Chapters 2 and 3. Chapter 6 uses a data set on the number of occurrences of ear disease to show that, for zero-inflated data whose mean differs from the variance, the generalized Poisson Hurdle regression model gives the best fit among the models considered. The thesis closes with conclusions and topics for further research.
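The two ingredients named in the abstract, a dispersion parameter and a separate treatment of zeros, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a common mean parameterization of the generalized Poisson distribution (mean `mu`, variance `mu * (1 + alpha * mu)**2`, with `alpha = 0` giving the ordinary Poisson), which may differ from the parameterization used in the thesis, and the names `gp_logpmf`, `hurdle_gp_pmf` and `pi0` are hypothetical.

```python
import math


def gp_logpmf(y, mu, alpha):
    """Log-pmf of a generalized Poisson distribution (assumed parameterization).

    Mean is mu and variance is mu * (1 + alpha * mu)**2.
    alpha is the dispersion parameter; alpha = 0 recovers the Poisson pmf.
    This sketch assumes mu > 0 and alpha >= 0.
    """
    return (
        y * math.log(mu / (1.0 + alpha * mu))
        + (y - 1) * math.log(1.0 + alpha * y)
        - mu * (1.0 + alpha * y) / (1.0 + alpha * mu)
        - math.lgamma(y + 1)
    )


def hurdle_gp_pmf(y, pi0, mu, alpha):
    """Pmf of a hurdle model with a zero-truncated generalized Poisson part.

    pi0 is the probability of a zero (the "hurdle"); positive counts follow
    the generalized Poisson distribution conditioned on y > 0.
    """
    if y == 0:
        return pi0
    p_zero = math.exp(gp_logpmf(0, mu, alpha))  # zero probability of the untruncated part
    return (1.0 - pi0) * math.exp(gp_logpmf(y, mu, alpha)) / (1.0 - p_zero)


def hurdle_gp_loglik(counts, pi0, mu, alpha):
    """Log-likelihood of i.i.d. counts under the hurdle model (no covariates)."""
    return sum(math.log(hurdle_gp_pmf(y, pi0, mu, alpha)) for y in counts)
```

In a regression setting, `pi0` and `mu` would each be linked to covariates (for example through logit and log links); the sketch keeps them constant to show only how the zero part and the truncated count part combine.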
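[Degree-granting institution]: Central China Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: O212.1
Article ID: 2160458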
Link to this article: https://www.wllwen.com/kejilunwen/yysx/2160458.html