A Study of Several Problems in Finite Mixture Models

Published: 2018-08-18 17:27
[Abstract]: Finite mixture models offer great flexibility and convenience for analyzing data that contain two or more subpopulations. Accordingly, they are widely used in many fields, including astronomy, medicine, genetics, engineering, and the social sciences. The problems studied in this thesis arise mostly from medicine and genetics, and all are closely related to mixture models. The thesis consists of three parts.

In the first part, we study the two-sample homogeneity testing problem in which one of the two samples may come from a mixture distribution. We assume that the kernel function is a general location-scale distribution and allow the scale parameters of the kernel to differ. The presence of the mixture distribution, together with unequal scale parameters, makes the likelihood function unbounded, and the Fisher information with respect to the mixing proportion may be infinite. These features make testing the homogeneity of the two samples very challenging. To address this, we construct a penalized likelihood function and, based on it, an EM test statistic that tests the mean and variance information simultaneously. We also study in detail the limiting distributions of the EM test statistic under the null hypothesis and under local alternatives, and discuss sample size calculation. Simulation results and a real-data analysis show that the proposed EM test is more efficient than existing methods and widely applicable. This part extends and enriches the application of the EM test to two-sample problems under general location-scale mixture distributions.

Interval detection of quantitative trait loci (QTL) often involves mixture models. In the second part, with a location-scale kernel function, we study the application of the likelihood ratio test to QTL interval detection. We consider two genetic settings: during meiosis, double crossover between non-sister chromatids of homologous chromosomes is either absent or present. These two settings are treated in Chapter 3 and Chapter 4, respectively.

In Chapter 3, we derive the large-sample properties of the maximum likelihood estimator and the likelihood ratio statistic in two cases. In case (1), both the location and the scale parameters of the kernel may differ; in case (2), the location parameters may differ while the scale parameters are equal and unknown. In these two cases, we prove that the limiting distribution of the likelihood ratio is the supremum of the chi-square processes $\chi^2_2(\theta)$ and $\chi^2_1(\theta)$, respectively. From this result alone, the critical values of the likelihood ratio test are not easy to compute. We therefore give explicit forms of the limiting distributions in both cases, which make the critical values easy to obtain and greatly reduce the difficulty of finding them in QTL interval detection. We also study the limiting distribution of the likelihood ratio test statistic under local alternatives. Through numerical simulation, we examine the finite-sample properties of the likelihood ratio test statistic and compare it with existing methods; the simulation results corroborate the favorable properties derived for the likelihood ratio test. Finally, we apply the likelihood ratio test to a real problem, and the analysis shows that the test is widely applicable.

In Chapter 4, we assume that double crossover occurs between non-sister chromatids of homologous chromosomes; all other assumptions are the same as in Chapter 3. Because of the double crossover, the statistical model differs from that of Chapter 3. We again consider case (1) and case (2). In case (2), the construction of the likelihood ratio test and its large-sample properties are similar to those in Chapter 3. In case (1), however, the likelihood ratio test cannot be built directly on the likelihood function: under the model of this chapter the likelihood function is unbounded, so a consistent maximum likelihood estimator cannot be obtained. We therefore add a penalty function on the scale parameters to construct a penalized likelihood function, which yields a consistent penalized maximum likelihood estimator. Based on the penalized likelihood function, we construct the likelihood ratio test and establish its large-sample properties. As in Chapter 3, we further study the explicit forms of the limiting distributions in both cases and the limiting distribution of the likelihood ratio test statistic under local alternatives. Finally, through numerical simulation and a real-data analysis, we study the finite-sample properties of the likelihood ratio test statistic and compare it with existing methods.

In the third part, we study the strong consistency of the maximum likelihood estimator for finite location-scale mixture models with a structural parameter. Without imposing any restriction on the parameter space, we give the strong consistency result together with a detailed proof. We also give several examples, namely finite mixture models whose kernel functions are the normal, logistic, extreme-value, and t distributions, and prove that these models satisfy the assumed conditions.
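
The abstract states that unequal scale parameters make the mixture likelihood unbounded and that a penalty on the scale parameters restores a usable objective. The following is a minimal sketch of that phenomenon, not the thesis's own procedure. It assumes a two-component normal kernel, a small made-up data set, and a penalty of the form -a_n (s^2/sigma^2 + log(sigma^2/s^2)) with a_n = 1/n, a form that appears in the penalized-likelihood literature for normal mixtures; the abstract does not specify the penalty actually used, and all function names here are illustrative.

```python
import math
import statistics


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_loglik(data, alpha, mu1, sigma1, mu2, sigma2):
    """Log-likelihood of (1 - alpha) N(mu1, sigma1^2) + alpha N(mu2, sigma2^2)."""
    return sum(
        math.log((1.0 - alpha) * normal_pdf(x, mu1, sigma1)
                 + alpha * normal_pdf(x, mu2, sigma2))
        for x in data
    )


def scale_penalty(sigma, s, a_n):
    """Assumed penalty -a_n * (s^2/sigma^2 + log(sigma^2/s^2)).

    It is largest at sigma = s and tends to -infinity as sigma -> 0.
    """
    r = (s * s) / (sigma * sigma)
    return -a_n * (r - math.log(r))


# Illustrative data (made up for this sketch, not from the thesis).
data = [-1.3, -0.7, -0.2, 0.1, 0.4, 0.9, 1.6, 2.1]
s = statistics.pstdev(data)   # sample standard deviation used in the penalty
a_n = 1.0 / len(data)         # assumed penalty weight

# Centre the second component on one observation and shrink its scale.
sigmas = [1e-1, 1e-3, 1e-6]
plain = [mixture_loglik(data, 0.5, 0.0, 1.0, data[0], sg) for sg in sigmas]
penalized = [
    ll + scale_penalty(1.0, s, a_n) + scale_penalty(sg, s, a_n)
    for ll, sg in zip(plain, sigmas)
]

for sg, ll, pl in zip(sigmas, plain, penalized):
    print(f"sigma2={sg:g}  loglik={ll:.3f}  penalized={pl:.3f}")
```

In this sketch the plain log-likelihood keeps growing as the second component's scale shrinks onto a single observation, while the penalized version falls sharply, which is the behaviour that motivates penalizing the scale parameters.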
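[Degree-granting institution]: East China Normal University (华东师范大学)
[Degree level]: Doctoral
[Year degree conferred]: 2017
[Classification number]: O212.1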

[Related literature]

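Related journal articles

1 Xu Qinfeng (徐勤丰); Yu Yan (俞燕); Sun Pengfei (孙鹏飞); Bayesian cluster analysis of ordinal data using a latent-variable finite mixture model [J]; Chinese Journal of Applied Probability and Statistics (应用概率统计); 2007, No. 04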

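Related conference papers (top 1)

1 Yang Zhihao (杨志豪); Hu Guangmin (胡光岷); Locally cooperative finite mixture model estimation and its topology identification method [A]; Proceedings of the 2008 China Western Youth Communications Academic Conference (2008年中国西部青年通信学术会议论文集) [C]; 2008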

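Related doctoral dissertations (top 2)

1 Liu Guanfu (刘关福); A Study of Several Problems in Finite Mixture Models [D]; East China Normal University; 2017

2 Wang Haixian (王海贤); Finite mixture models, nonlinear two-dimensional principal component analysis, and their applications in pattern classification [D]; Anhui University; 2005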

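Related master's theses (top 6)

1 Wang Xubin (王旭彬); Research on collaborative filtering algorithms based on finite mixture models [D]; Guangdong University of Technology; 2016

2 Wang Guan (王观); Analysis of traffic accident severity based on a Hybrid finite mixture model [D]; Beijing Jiaotong University; 2017

3 Sun Lan (孙兰); Research progress on finite mixture models and their applications [D]; Northeast Normal University; 2006

4 Xiao Wei (肖维); Clustering algorithms based on finite mixture models and their applications [D]; North University of China; 2011

5 Gao Kai (高凯); Research on brain MRI image segmentation algorithms based on finite mixture models [D]; Nanjing University of Science and Technology; 2014

6 Jiang Huan (江欢); Research on automatic image annotation based on finite mixture models [D]; Anhui University; 2010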



Article ID: 2190166


Article link: https://www.wllwen.com/shoufeilunwen/jckxbs/2190166.html

