High-Throughput Screening of lncRNAs That Mediate Chromatin Interactions Using Hi-C
Published: 2018-07-17 21:34
【Abstract】: The nucleus is unique to eukaryotes and is the largest organelle. Chromatin is the carrier of genetic and epigenetic information and is the largest biological macromolecule. About 2 m of chromatin is folded into a nucleus less than 10 μm in diameter. The development of chromosome conformation capture (3C) and its derivative techniques (4C, 5C, Hi-C) has advanced the study of nuclear structure. These techniques have shown that proteins such as CTCF and cohesin play important roles in chromatin folding and interaction. Recently, some lncRNAs have also been found to take part in chromatin interactions. By interacting with DNA, proteins, and even RNA itself, lncRNAs such as XIST and Firre participate in chromatin interactions and regulate the formation of nuclear structure. Because most of the genome encodes ncRNAs, we hypothesized that many lncRNAs may be involved in regulating nuclear structure. Until now, however, there has been no systematic report on lncRNA involvement in chromatin interactions. To screen genome-wide for lncRNAs that may participate in chromatin interactions, we established a high-throughput screening method based on Hi-C, which works by comparing genome-wide chromatin interactions before and after RNase treatment. Building a high-quality Hi-C library is the key to the whole project. Hi-C is a multi-step, time-consuming molecular biology technique that requires many reagents and instruments. The technique is not yet fully mature, and its reproducibility has so far been neither good nor stable. By optimizing the core steps of the complex Hi-C experiment, such as cross-linking, restriction digestion, restriction endonuclease inactivation, and in situ ligation, we established a mature and stable Hi-C library construction protocol. After the initial Hi-C libraries were amplified, Hi-C libraries from two biological replicates of GM12878 cells, before and after RNase treatment, were subjected to high-throughput sequencing.
After preliminary bioinformatics analysis, the quality of the libraries and the reproducibility of the biological replicates were assessed. On average, the raw data had a mapping rate of about 90% and a pairing rate of 72%. In addition, after removing self-ligated fragments and dangling ends, more than 96% valid interaction pairs were obtained. Correlation analysis showed that both bin coverage and all bin pairs were very strongly correlated between the two biological replicates. Together these results further demonstrate that the optimized Hi-C library construction protocol is reliable and stable. Having obtained high-quality, highly reproducible libraries, we compared the libraries from samples before and after RNase treatment. The comparison showed that many interactions were weakened or even lost after RNase treatment. During library construction we also found that, for the same number of cells, the untreated group yielded 1.58 times as much initial Hi-C library (interaction fragments) as the RNase-treated group. After subtracting the RNase-treated group from the untreated group across the two biological replicates, we selected the top 10,000 differential interaction pairs with positive values for further analysis. We found 4,081 lncRNA-encoding genes near the chromatin interaction sites that were lost or weakened after RNase treatment. GO annotation showed that the genes near these 4,081 lncRNA-encoding genes are mainly associated with biological structures or functions such as the cell membrane, the Pleckstrin homology-like domain, ammonium ion transport, and alternative splicing. The 4,081 lncRNAs identified are candidates for involvement in chromatin interactions, which provides a good basis for further study of their molecular mechanisms and functions.
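The screening logic described in the abstract (replicate correlation, subtraction of the RNase-treated contact map from the untreated one, selection of the top positive differences, and lookup of nearby lncRNA genes) can be illustrated with a minimal sketch. This is not the thesis's pipeline: the function names, the single-chromosome toy layout, the `flank` parameter, and the use of plain unnormalized subtraction are all assumptions made for illustration, since the abstract does not state how the matrices were normalized or how "near" was defined.

```python
import numpy as np


def replicate_correlation(rep1, rep2):
    """Pearson r between two replicate contact matrices (hypothetical helper).

    Returns (r for per-bin coverage, r for all bin pairs).
    """
    rep1, rep2 = np.asarray(rep1, float), np.asarray(rep2, float)
    # Upper triangle including the diagonal, so each bin pair is counted once.
    iu = np.triu_indices_from(rep1)
    r_coverage = np.corrcoef(rep1.sum(axis=1), rep2.sum(axis=1))[0, 1]
    r_pairs = np.corrcoef(rep1[iu], rep2[iu])[0, 1]
    return r_coverage, r_pairs


def top_lost_interactions(control, rnase, n_top):
    """Bin pairs with the largest positive (control - rnase) difference.

    Assumption: matrices are subtracted as given; no depth normalization.
    Returns a list of (bin_i, bin_j, difference), strongest loss first.
    """
    diff = np.asarray(control, float) - np.asarray(rnase, float)
    rows, cols = np.triu_indices_from(diff, k=1)  # off-diagonal pairs only
    vals = diff[rows, cols]
    positive = np.flatnonzero(vals > 0)  # interactions weakened by RNase
    ranked = positive[np.argsort(-vals[positive], kind="stable")][:n_top]
    return [(int(rows[k]), int(cols[k]), float(vals[k])) for k in ranked]


def lncrnas_near(pairs, genes, bin_size, flank=0):
    """Names of genes overlapping either anchor bin of any pair.

    genes: iterable of (name, start, end) on one chromosome (toy layout).
    flank: extra bases added on both sides of each bin (assumed definition
    of "near"; the abstract does not give a distance).
    """
    hits = set()
    for i, j, _ in pairs:
        for b in (i, j):
            lo, hi = b * bin_size - flank, (b + 1) * bin_size + flank
            for name, start, end in genes:
                if start < hi and end > lo:  # half-open interval overlap
                    hits.add(name)
    return sorted(hits)
```

In the setting of the abstract, `n_top` would be 10000 and `genes` would come from an lncRNA annotation; a real analysis would also need to handle multiple chromosomes and would likely normalize for sequencing depth before subtracting.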
【Degree-granting institution】: Liaocheng University
【Degree level】: Master's
【Year degree granted】: 2017
【Classification number】: Q78
Article ID: 2131005
Article link: https://www.wllwen.com/shoufeilunwen/benkebiyelunwen/2131005.html