Research on RNA Secondary Structure Prediction Algorithms with Pseudoknots

Published: 2018-03-27 17:18

  Topic: RNA secondary structure  Focus: pseudoknots  Source: Nanjing University of Aeronautics and Astronautics, master's thesis, 2017


[Abstract]: Ribonucleic acid (RNA) is a class of biological macromolecule that plays an important role in many cellular processes, including the expression and transmission of genetic information, gene regulation, and catalysis. Unlike DNA, RNA has more complex and diverse structures, which is the material basis for its rich functional properties. Determining RNA spatial structure by physical experiment is costly, and experiments alone cannot keep up with the massive volume of sequences awaiting analysis. Computational prediction of RNA secondary structure has therefore become an important and challenging problem. Within RNA secondary structure, a class of substructure formed by crossing (interleaved) stem regions is called a pseudoknot. Because pseudoknots have been shown to play a key role in many RNA catalytic processes, they have recently received growing attention in RNA structure prediction. This thesis applies a genetic algorithm to RNA secondary structure prediction, covering two types of pseudoknot structure, and verifies the usability and effectiveness of the algorithm experimentally. To address the efficiency of RNA structure prediction, it then proposes heterogeneous parallel acceleration based on OpenCL: the serial algorithm is improved and optimized, its parallelizable parts are identified, the computing tasks are repartitioned, and a CPU+GPU model is used for heterogeneous acceleration. Finally, experiments compare the efficiency of the serial and parallel algorithms. The main work of the thesis is as follows:

(1) An improved genetic algorithm is implemented. Compared with the traditional genetic algorithm, its genetic operations are modified so that the prediction algorithm follows the folding process of RNA secondary structure more closely. The algorithm is based on the minimum free energy principle, combines the MathewsTurner and DirksPierce energy parameters, and predicts two types of pseudoknot structure, including the H-type pseudoknot. On a test set selected from the RNA STRAND database, it achieves a positive predictive value of 0.81 and a sensitivity of 0.79. This shows that the algorithm is effective and usable and can serve as one reference for RNA secondary structure analysis.

(2) To address the low efficiency of genetic-algorithm-based prediction of RNA secondary structure with pseudoknots, a heterogeneous parallel acceleration algorithm based on OpenCL is proposed. A parallelism analysis of the serial algorithm above shows that its two most time-consuming stages, helix dot-matrix filling and iterative population evolution, are suitable for heterogeneous acceleration. The algorithm is then restructured so that these two stages run on a GPU device under the OpenCL programming framework. On the same test set, the heterogeneous parallel algorithm achieves an average speedup of 2.8x over the serial algorithm, effectively reducing the time needed for RNA secondary structure prediction and improving the efficiency of computational prediction.
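The abstract describes pseudoknots as crossing stems and reports accuracy as positive predictive value and sensitivity, but does not state how either is computed. The sketch below is a minimal illustration under common assumptions, not the thesis's implementation: a structure is a set of base pairs `(i, j)` with `i < j`, two pairs `(i, j)` and `(k, l)` cross when `i < k < j < l`, and the metrics are taken at the base-pair level as PPV = TP / (TP + FP) and sensitivity = TP / (TP + FN). All function names and the toy structures are hypothetical.

```python
from typing import Set, Tuple

Pair = Tuple[int, int]  # (i, j) with i < j, 0-based sequence positions


def crossing_pairs(pairs: Set[Pair]) -> Set[Tuple[Pair, Pair]]:
    """Return every pair of base pairs (i, j), (k, l) with i < k < j < l."""
    ordered = sorted(pairs)  # sorted by opening position, so i <= k below
    found = set()
    for idx, (i, j) in enumerate(ordered):
        for k, l in ordered[idx + 1:]:
            if i < k < j < l:  # the second pair opens inside and closes outside
                found.add(((i, j), (k, l)))
    return found


def has_pseudoknot(pairs: Set[Pair]) -> bool:
    """A structure is pseudoknotted if at least two of its base pairs cross."""
    return bool(crossing_pairs(pairs))


def ppv_and_sensitivity(predicted: Set[Pair], reference: Set[Pair]) -> Tuple[float, float]:
    """Base-pair-level PPV and sensitivity (assumed definitions, see lead-in)."""
    true_positives = len(predicted & reference)
    ppv = true_positives / len(predicted) if predicted else 0.0
    sensitivity = true_positives / len(reference) if reference else 0.0
    return ppv, sensitivity


# Toy H-type-like layout: stem 1 (positions 0-2 with 8-10) interleaves with
# stem 2 (positions 5-6 with 14-15). These coordinates are made up.
reference = {(0, 10), (1, 9), (2, 8), (5, 15), (6, 14)}
predicted = {(0, 10), (1, 9), (5, 15), (7, 13)}

print(has_pseudoknot(reference))
print(ppv_and_sensitivity(predicted, reference))
```

The quadratic scan over sorted pairs is chosen for readability; for long sequences a stack or interval-based check would scale better.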
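[Degree-granting institution]: Nanjing University of Aeronautics and Astronautics
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: Q522; TP18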




Article ID: 1672494


Link to this article: https://www.wllwen.com/shoufeilunwen/benkebiyelunwen/1672494.html

