Research on Statistical Methods for RNA/DNA and Cancer Gene Sequencing Data
Published: 2021-10-21 15:53
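With the development of next-generation sequencing (NGS) technology and the falling cost of sequencing, biological experiments of all kinds now produce large volumes of sequencing data. This poses new challenges for statistical analysis: how to apply statistical tests to these massive data sets in order to test the hypotheses raised by biological experiments, and how to use statistical methods to compensate for the limits of sequencing technology in fully revealing the underlying biology. This thesis studies the application of statistical methods to RNA sequencing data, DNA methylation data, and cancer gene sequencing data.
· RNA sequencing. An important application of NGS is RNA sequencing, which records all gene transcription quickly and at low cost. Compared with microarray data, RNA sequencing data characterize transcription levels more precisely. In an RNA sequencing experiment, millions of short reads are aligned to a reference genome, and the number of reads falling within each genomic region of interest is recorded. The regions of interest to biologists are generally called microRNA (miRNA), small interfering RNA (siRNA), long non-coding RNA (lncRNA), or messenger RNA (mRNA). Studies have shown that read counts are linearly related to the abundance of the target transcript. The most basic goal in analyzing such sequencing data is to better identify, across different biological or...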
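[Source]: University of Science and Technology of China, Anhui Province (Project 211 and Project 985 institution)
[Pages]: 105
[Degree level]: Doctoral
[Table of contents]:
Abstract (Chinese)
ABSTRACT
Contents
List of Tables
List of Figures
List of Main Symbols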
Chapter I Introduction
Chapter II Differential Expression Test in RNA-seq Data
2.1 Introduction
2.2 Overview of Existing Normalization Methods
2.2.1 Glob
2.2.2 TMM
2.2.3 Lowess
2.2.4 Quantile
2.2.5 DESeq
2.2.6 edgeR
2.3 Overview of Existing Differential Expression Test Methods
2.3.1 DESeq
2.3.2 edgeR
2.4 Overview of deGPS
2.4.1 GP-MLE2L normalization
2.4.2 GP-Quantile normalization
2.4.3 GP-Theta normalization
2.4.4 GP-MLE1L normalization
2.4.5 Differential Expression Test in deGPS
2.5 Simulations and Results
2.5.1 Necessity of data normalization in RNA-seq
2.5.2 Empirical statistical evaluations of different normalization methods
2.5.3 Type Ⅰ errors and statistical powers
2.5.4 Sensitivity and specificity
2.6 Discussion
Chapter III Statistical Methods for Analyzing Base-resolution Methylation Sequencing Data
3.1 Introduction
3.2 Overview of Generalized Linear Mixed Model
3.3 Different Estimations of GLMM
3.3.1 Pseudo-likelihood Estimation Based on Linearisation
3.3.2 Maximum Likelihood Estimation Based on Laplace Approximation
3.3.3 Bayesian Hierarchical GLMM
3.4 Simulation
3.4.1 Simulation for GLIMMIX
3.4.2 Simulation for Bayesian Hierarchical Model
3.5 Discussion
Chapter IV Subclone Detection for Cancer Cells
4.1 Introduction
4.2 Model for Two Subclones
4.2.1 Model Description
4.2.2 Parameter Estimate
4.2.3 The Statistical Significant Test of Two Subclones
4.3 Model for Multiple Sub-clones
4.4 Further Research
References
Appendix A Appendix
Acknowledgements
Academic Papers Published and Research Results Obtained During the Degree Program
Article ID: 3449291
Article link: https://www.wllwen.com/kejilunwen/yysx/3449291.html