
Prediction of High-Confidence lincRNAs in Mouse Embryonic Stem Cells and a Study of Their Regulatory Patterns

Published: 2018-01-16 05:15

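  Keywords: Prediction of High-Confidence lincRNAs in Mouse Embryonic Stem Cells and a Study of Their Regulatory Patterns. Source: Harbin Institute of Technology, 2017 doctoral dissertation. Document type: degree thesis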


  Related topics: lincRNA, ES cells, RNA-Seq, interaction network, histone modification


[Abstract]: lincRNAs function in metabolism, growth and development, and disease, and they regulate gene expression at multiple levels. As key regulators, lincRNAs play important regulatory roles in mouse ES cells. This study uses high-throughput RNA-Seq data to identify unannotated, high-confidence lincRNA transcripts expressed in mouse ES cells, improving the genome annotation of lincRNAs. It also identifies the characteristic regulatory patterns of enhancer-associated and promoter-associated lincRNAs, identifies interactions between elincRNAs and promoters, and investigates how lincRNAs regulate gene expression.

By integrating multiple RNA-Seq datasets from mouse ES cells, early embryos, and whole embryos, this thesis identified 6,701 novel lincRNAs expressed in mouse ES cells. Assessment of transcript completeness using RNA-Seq read coverage and CAGE showed that the novel lincRNAs identified from RNA-Seq are incomplete transcripts lacking their 5′ ends. Analysis of the TSS regions of known lincRNAs and protein-coding transcripts showed that lincRNAs have distinctive genomic and epigenomic features. Evaluation by ten-fold cross-validation and on an independent test set showed that the lincRNA TSS-region prediction model integrating genomic and epigenomic features performed best. lincRNA TSS regions were then predicted across the whole mouse genome, and the TSS regions of 1,293 novel lincRNAs were corrected. Evaluating the TSS regions before and after correction with CAGE and active chromatin marks showed that the predicted TSS regions yield relatively complete lincRNA transcripts in mouse ES cells.

Genomic distribution analysis and genomic and epigenomic characterization showed that the novel lincRNAs resemble known lincRNAs: compared with protein-coding transcripts they have fewer exons, shorter transcripts, and lower conservation, and they are enriched in repetitive elements. The epigenetic modification patterns of lincRNAs also differ significantly from those of protein-coding transcripts. RT-PCR measurement of novel lincRNA expression in different cell lines and in different tissues at different mouse developmental stages showed that the novel lincRNAs are expressed in a tissue- and cell-specific manner. RACE experiments were then used to determine the full length of transcript TCONS_00041333; the results showed that this lincRNA comprises two transcripts, 656 bp and 571 bp long. Analysis of core promoter element binding regions showed binding regions for the TATA-box, GC-box, CCAAT-box, and Initiator upstream of its TSS, together with enrichment of the histone modifications H3K4me1 and H3K27ac.

According to chromatin state, lincRNAs can be divided into elincRNAs (enhancer-associated lincRNAs) and plincRNAs (promoter-associated lincRNAs). Based on the H3K4me1/H3K4me3 enrichment ratio at the TSS regions of known lincRNA transcripts in mouse ES cells, a high-confidence set of 224 elincRNAs and 112 plincRNAs was identified. Integrating genomic and epigenomic features, a regularized logistic regression model was used to identify the features that significantly regulate elincRNAs and plincRNAs: elincRNAs are associated with DNA methylation at the TSS region and with DNA methylation and H3K122ac in the gene body, whereas plincRNAs are associated with H3K9ac at the TSS region and H3K36me3 in the gene body. The prediction model further identified 3,729 elincRNAs and 1,392 plincRNAs. Genomic and epigenomic characterization showed that elincRNAs have fewer exons, shorter transcripts, and lower expression levels and sequence conservation than plincRNAs, as well as distinct chromatin modification patterns.

The regulatory patterns of elincRNA-promoter interactions in mouse ES cells were analyzed on the basis of histone modification patterns and transcription factor enrichment patterns; the results showed that these interactions tend to be regulated by transcription factors. Evaluation against a high-confidence set of elincRNA-promoter interactions in mouse ES cells showed that the interactions identified from the Spearman correlation of transcription factors form the best prediction set. An interaction network was built from the high-confidence interaction set, along with an interaction subnetwork based on transcription factor correlation. Analysis of network topology showed that the subnetwork has properties similar to those of the full interaction network, and that elincRNAs target specific promoters rather than regulating broadly. Module mining and functional enrichment analysis of the subnetwork showed that some modules are enriched for functions of RNA polymerase II-binding, transcription-activating transcription factors and for functions related to ES cells and embryonic development. elincRNAs may therefore take part in activating the transcription of their target genes.

In summary, this study identified a set of lincRNAs expressed in mouse ES cells with relatively complete transcript boundaries, used machine learning models to identify the regulatory features of elincRNAs and plincRNAs, and identified interactions between elincRNAs and their target promoters in mouse ES cells. The work not only discovers and studies lincRNAs that are important in mouse development, but is also significant for the systematic study of how lincRNAs regulate gene expression during early embryonic development.
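
As an illustration of the H3K4me1/H3K4me3 enrichment-ratio idea mentioned in the abstract, the following is a minimal Python sketch and not the thesis's actual procedure. The function name `classify_lincrna`, the pseudocount, and the log2 cutoff are all assumptions made for this example; the abstract does not state which thresholds were used. The sketch relies only on the general convention that H3K4me1 marks enhancers and H3K4me3 marks promoters.

```python
import math


def classify_lincrna(h3k4me1, h3k4me3, pseudocount=1.0, log2_cutoff=1.0):
    """Label a lincRNA TSS region from two histone-mark enrichment values.

    The pseudocount and cutoff are illustrative assumptions, not values
    taken from the thesis.
    """
    # The pseudocount avoids division by zero when a mark is absent.
    log_ratio = math.log2((h3k4me1 + pseudocount) / (h3k4me3 + pseudocount))
    if log_ratio >= log2_cutoff:
        return "elincRNA"      # H3K4me1-dominated, enhancer-like chromatin
    if log_ratio <= -log2_cutoff:
        return "plincRNA"      # H3K4me3-dominated, promoter-like chromatin
    return "unclassified"      # ambiguous ratio, left out of the confident set


if __name__ == "__main__":
    # Hypothetical enrichment values, for demonstration only.
    for me1, me3 in [(7.0, 1.0), (1.0, 7.0), (3.0, 3.0)]:
        print(me1, me3, classify_lincrna(me1, me3))
```

Leaving ambiguous ratios unclassified is one simple way to keep a high-confidence set small, which is in the spirit of the 224 elincRNAs and 112 plincRNAs reported above, although the actual selection criteria may differ.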

[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctoral
[Year degree granted]: 2017
[Classification number]: R3416


Article ID: 1431690


Link to this article: https://www.wllwen.com/shoufeilunwen/yxlbs/1431690.html

