Research on Parallelization of Core Scientific Computing Algorithms for Multi-core Systems
Published: 2018-08-23 19:56
[Abstract]: One trend for accelerating future large-scale scientific computing is the use of heterogeneous multi-core and many-core systems. However, software parallel programming models, especially those targeting heterogeneous multi-core platforms, have lagged behind the rapid development of hardware. Making full use of the parallel computing capability that the hardware provides in a heterogeneous multi-core environment, and thereby improving the efficiency of parallel execution, has become the primary task of parallel programming.
To address this problem, this thesis proposes MS-BSP, a parallel computing model for heterogeneous multi-core systems. Compared with the traditional general-purpose BSP model, MS-BSP better reflects how different types of tasks are assigned to different types of processor cores for parallel processing, and it guides the design and analysis of parallel scientific computing algorithms on such systems. Based on this model, the thesis proposes a parallel programming framework for scientific computing. Compared with the complex programming styles required by IBM's Cell and Nvidia's CUDA architectures, programming under MS-BSP leaves the mapping of multi-threaded kernel functions to the system, which spares developers the tedious explicit handling of storage units and synchronization mechanisms and makes programming easier. Finally, the thesis implements the interface between parallel programming and the operating system on the RED platform according to the MPI specification, achieving compatibility with MPI functions and improving the portability of the proposed model.
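The bulk-synchronous, master-slave style that the abstract describes can be illustrated with a minimal sketch. This is not the MS-BSP API from the thesis, which the abstract does not show. It assumes Python threads stand in for slave cores, and the names `split`, `superstep`, and `parallel_sum_of_squares` are hypothetical. The master partitions the data, the workers run a kernel in parallel, and the master continues only after every worker has finished, which plays the role of the BSP barrier.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, parts):
    """Master side: cut data into `parts` near-equal contiguous chunks."""
    size, extra = divmod(len(data), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def superstep(pool, kernel, chunks):
    """One superstep: workers run `kernel` on their chunks in parallel.

    list() consumes every result, so this function returns only after
    all workers are done; that wait acts as the barrier.
    """
    return list(pool.map(kernel, chunks))


def parallel_sum_of_squares(data, workers=4):
    """Hypothetical example kernel: sum of squares via one superstep."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = superstep(
            pool, lambda chunk: sum(x * x for x in chunk), split(data, workers)
        )
    # Master-side reduction, performed after the barrier.
    return sum(partials)
```

In this sketch the caller never touches thread creation or synchronization directly, which loosely mirrors the abstract's point that kernel mapping and synchronization are left to the system rather than to the developer.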
Guided by this parallelization framework, six core algorithms from scientific computing applications are parallelized and optimized, then implemented and comparatively evaluated on two platforms: the RED on-chip multi-core platform designed and developed by the "浙大数芯" (Zhejiang University Digital Core) laboratory, and IBM's mature commercial Cell processor platform. The evaluation verifies the practicality and efficiency of the proposed parallel computing model, and all six algorithms achieve high performance on both platforms. Because MS-BSP is optimized on the RED platform for its master-slave heterogeneous multi-core architecture, task scheduling overhead is significantly reduced: implementation efficiency (defined as the ratio of parallel speedup to the number of accelerator cores actually used) is no less than 75.67% on RED and no less than 63.91% on the existing Cell platform.
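The efficiency metric defined above (parallel speedup divided by the number of accelerator cores actually used) can be written out as a small sketch. The function names and the timing and core-count values below are hypothetical; the abstract does not give core counts or run times.

```python
def parallel_efficiency(serial_time, parallel_time, accel_cores):
    """Efficiency = speedup / number of accelerator cores actually used."""
    if serial_time <= 0 or parallel_time <= 0 or accel_cores <= 0:
        raise ValueError("times and core count must be positive")
    speedup = serial_time / parallel_time
    return speedup / accel_cores


def min_speedup(efficiency_floor, accel_cores):
    """Smallest speedup consistent with a given efficiency lower bound."""
    return efficiency_floor * accel_cores
```

For instance, with a hypothetical 8 accelerator cores, an efficiency floor of 75.67% corresponds to a speedup of at least about 6.05, and a floor of 63.91% to a speedup of at least about 5.11.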
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year conferred]: 2012
[Classification number]: TP332
Article ID: 2199751
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2199751.html