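Parallel Implementation of Grid-Based Protein Structure Prediction
Published: 2018-05-15 22:35
Topics: protein folding + genetic algorithm; Source: Wuhan University of Science and Technology, 2012 master's thesis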
[Abstract]: Predicting the folded structure of a protein is an active topic in current biological research. Because of the special structure of proteins and the constraints of the models used, one research direction treats the task as an NP problem: search for the conformation with the minimum energy value and infer the folded structure from it. Many algorithms have been applied to this problem, but the computation is complex and therefore costly in time. The genetic annealing algorithm (GAA) combines the genetic algorithm with simulated annealing, pairing the strong global search of the former with the fast local convergence of the latter, and has therefore become a common algorithm for protein folding structure prediction.

Grid computing is a form of distributed parallel computing that uses idle resources on a network to solve large-scale computing problems. A grid parallel system is designed on top of grid middleware and offers strong manageability, high security, convenient data transfer, and good scalability. In terms of computational efficiency and equipment cost, it is well suited to general research on large-scale computing problems. An MPI-based grid programming interface is one way to implement parallel computing on a grid.

This thesis uses the genetic annealing algorithm to solve the protein folding structure prediction problem and ports the serial algorithm to a parallel grid platform. In the parallel algorithm, the population is divided into several subpopulations that are distributed to child nodes, each of which performs its evolutionary operations independently. The algorithm's operators (selection, crossover, mutation) are also improved according to the actual situation, to obtain better computation speed and algorithm efficiency.
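The subpopulation scheme described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the 2D HP lattice energy model, the tournament selection, the cooling schedule, the ring migration step, and every name used here (`energy`, `evolve`, `parallel_gaa`, `migrate_every`, and so on) are hypothetical choices made for the example. The subpopulations are evolved in a plain loop inside one process, standing in for child nodes that would communicate over MPI on a grid.

```python
import math
import random

# Assumed toy model: a chain on a 2D square lattice, encoded as moves.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def energy(seq, moves):
    """HP-style energy: minus the number of non-consecutive H-H contacts.

    A self-overlapping chain gets a positive penalty instead.
    """
    pos = [(0, 0)]
    for m in moves:
        dx, dy = MOVES[m]
        x, y = pos[-1]
        pos.append((x + dx, y + dy))
    if len(set(pos)) != len(pos):
        return len(seq)  # penalty for an invalid (overlapping) fold
    index = {p: i for i, p in enumerate(pos)}
    contacts = 0
    for i, (x, y) in enumerate(pos):
        if seq[i] != "H":
            continue
        for dx, dy in ((1, 0), (0, 1)):  # right/up only: each pair counted once
            j = index.get((x + dx, y + dy))
            if j is not None and seq[j] == "H" and abs(i - j) > 1:
                contacts += 1
    return -contacts


def evolve(seq, pop, temp, rng):
    """One generation: selection, crossover, then annealed mutation."""
    new_pop = []
    for _ in range(len(pop)):
        # Selection: binary tournament, lower energy wins.
        a = min(rng.sample(pop, 2), key=lambda m: energy(seq, m))
        b = min(rng.sample(pop, 2), key=lambda m: energy(seq, m))
        # Crossover: one cut point.
        cut = rng.randrange(1, len(a))
        child = a[:cut] + b[cut:]
        # Mutation, accepted with the Metropolis rule at temperature temp.
        mutant = list(child)
        mutant[rng.randrange(len(mutant))] = rng.choice("UDLR")
        delta = energy(seq, mutant) - energy(seq, child)
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            child = mutant
        new_pop.append(child)
    return new_pop


def parallel_gaa(seq, islands=4, island_size=10, generations=60,
                 migrate_every=10, seed=0):
    """Evolve several subpopulations and return (best_moves, best_energy)."""
    rng = random.Random(seed)
    n = len(seq) - 1

    def cost(m):
        return energy(seq, m)

    subpops = [[[rng.choice("UDLR") for _ in range(n)]
                for _ in range(island_size)] for _ in range(islands)]
    best = min((ind for pop in subpops for ind in pop), key=cost)
    temp = 2.0
    for gen in range(1, generations + 1):
        # On a grid, each subpopulation would evolve on its own child node.
        subpops = [evolve(seq, pop, temp, rng) for pop in subpops]
        temp = max(0.05, temp * 0.95)  # assumed geometric cooling
        if gen % migrate_every == 0:
            # Assumed ring migration: island k-1's best replaces island k's worst.
            bests = [min(pop, key=cost) for pop in subpops]
            for k, pop in enumerate(subpops):
                worst = max(range(len(pop)), key=lambda i: cost(pop[i]))
                pop[worst] = list(bests[k - 1])
        candidate = min((ind for pop in subpops for ind in pop), key=cost)
        if cost(candidate) < cost(best):
            best = list(candidate)
    return "".join(best), cost(best)
```

Calling `parallel_gaa("HPHPPHHPHH", seed=1)` returns the best move string found together with its energy. The list comprehension over `subpops` is the step that would be distributed across child nodes in a grid deployment, with the migration step becoming a message exchange between neighbouring nodes.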
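[Degree-granting institution]: Wuhan University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: Q51;TP338.6
Article ID: 1894264
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1894264.html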