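Research on GPU-Based Multi-Sequence Correlation Analysis Methods
Published: 2018-06-01 02:55
Topics: multi-sequence correlation analysis + multiple sequence alignment; Source: Huazhong University of Science and Technology, master's thesis, 2013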
[Abstract]: Multi-sequence correlation analysis builds on the idea of multiple sequence alignment to analyze how closely sequences are related and to explore the paths that link them. As the number of sequences keeps growing, existing CPU-based multi-sequence correlation analysis methods can no longer meet the demands of practical applications. Graphics processing units (GPUs) have gained computing power rapidly, and with their pipelined mode of operation and strong parallel computing capability they are widely used to solve compute-intensive problems, including improving the efficiency of multi-sequence correlation analysis.

Drawing on this parallel computing capability, this thesis proposes and implements a GPU-based multi-sequence correlation analysis method that is optimized for parallelism from three angles. First, the correlation analysis algorithm is improved: its execution process is adjusted to resolve the data dependencies inside the algorithm. Second, to reduce I/O load and enable asynchronous processing, a GPU-based data-stream parallel optimization strategy is proposed, which partitions the input distance matrix and combines the partitioning with an asynchronous processing mode so that the CPU and GPU process data cooperatively in parallel. Third, a GPU-based instruction-stream optimization strategy dynamically invokes different thread granularities, addressing thread congestion and idle threads when the relationships among the sequences are not known in advance. In addition, a minimum-chain model based on parallel bitonic sort is designed: sub-matrices of the corrected distance matrix are traversed in parallel, the traversal results are stored in a minimum-chain array, and the array is bitonic-sorted to quickly locate the minimum-value node pair in the current state. This parallelizes the most time-consuming step of the multi-sequence correlation analysis method.

The GPU-based multi-sequence correlation analysis method is implemented in C and C++ on the Linux operating system and the CUDA platform. With the precision of the output unchanged, it reduces the I/O transfer time of the input data and the time spent searching for the minimum-value node pair. Compared with the CPU-based multi-sequence correlation analysis method, the overall speedup reaches 25.1, and the correlation analysis performance is faster and more stable.
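The abstract does not spell out the minimum-chain model, so the following is only a sketch under stated assumptions, not the thesis's implementation. It assumes that each sub-block is one row of the upper triangle of the distance matrix, that the minimum-chain array holds one local minimum per row, and that the array is padded with infinity to a power-of-two length so a bitonic network can sort it. The names `ChainEntry`, `bitonicSort`, and `findMinPair` are hypothetical. The code is sequential C++; on a GPU, each iteration of the two loops marked below would typically be handled by one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical minimum-chain entry: the smallest value found in one
// sub-block (here: one matrix row) and the node pair it belongs to.
struct ChainEntry {
    double value;
    int row;
    int col;
};

// In-place bitonic sort, ascending by value. The length must be a power of two.
// Every compare-exchange within one (k, j) stage is independent of the others,
// which is what makes the network suitable for one-thread-per-element GPU code.
void bitonicSort(std::vector<ChainEntry>& a) {
    const std::size_t n = a.size();
    assert((n & (n - 1)) == 0 && "length must be a power of two");
    for (std::size_t k = 2; k <= n; k <<= 1) {          // size of bitonic runs
        for (std::size_t j = k >> 1; j > 0; j >>= 1) {  // compare distance
            for (std::size_t i = 0; i < n; ++i) {       // parallel on a GPU
                const std::size_t partner = i ^ j;
                if (partner <= i) continue;             // handle each pair once
                const bool ascending = (i & k) == 0;
                const bool outOfOrder = ascending
                    ? a[i].value > a[partner].value
                    : a[i].value < a[partner].value;
                if (outOfOrder) std::swap(a[i], a[partner]);
            }
        }
    }
}

// Builds the minimum-chain array from a symmetric distance matrix and
// returns the globally smallest off-diagonal entry with its node pair.
ChainEntry findMinPair(const std::vector<std::vector<double>>& dist) {
    const std::size_t n = dist.size();
    std::size_t padded = 1;
    while (padded < n) padded <<= 1;  // round up to a power of two
    const double inf = std::numeric_limits<double>::infinity();
    std::vector<ChainEntry> chain(padded, ChainEntry{inf, -1, -1});
    for (std::size_t i = 0; i < n; ++i) {               // parallel on a GPU
        for (std::size_t j = i + 1; j < n; ++j) {
            if (dist[i][j] < chain[i].value) {
                chain[i] = ChainEntry{dist[i][j], static_cast<int>(i),
                                      static_cast<int>(j)};
            }
        }
    }
    bitonicSort(chain);
    return chain[0];  // smallest entry after the ascending sort
}
```

Calling `findMinPair` on a symmetric matrix returns the closest node pair. Sorting the whole chain rather than running a plain minimum reduction follows the abstract's description; a reduction would also find the minimum, and which one is faster in practice depends on details the abstract does not give.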
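[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP332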