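Research on Heterogeneous OpenCL Code Generation and Optimization Methods for Many-Core Accelerators
Topic: OpenCL. Focus: heterogeneity. Source: China West Normal University, master's thesis, 2017. Document type: degree thesis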
[Abstract]: In recent years, rising power consumption, the limits imposed by interconnect delay, and ever-growing design complexity have constrained improvements in processor performance. Traditional single-core architectures can no longer meet market demand for performance. Advances in integrated circuit technology have made it possible to integrate multiple processor cores on a single chip to handle larger and more complex computing tasks, and processors have moved from single-core to multi-core and then to many-core designs. However, improving performance by continually adding cores of the same type also runs into a bottleneck: once the number of CPU cores reaches a limit, adding more cores no longer improves performance. To further increase computing capability, hardware design is trending toward heterogeneity. Yet because of the heterogeneous underlying hardware and the multi-level memory hierarchy, the difficulty of programming heterogeneous systems has become one of the bottlenecks restricting their development. As a result, OpenCL, the first heterogeneous parallel programming framework, has attracted growing interest. As an open standard for heterogeneous computing, OpenCL has received strong support from many vendors and provides a free, open, general-purpose standard for heterogeneous systems. To achieve independence in core processor technology, China's domestically developed Sunway TaihuLight supercomputer uses the domestic on-chip heterogeneous many-core processor SW26010. To reduce programming difficulty and improve software porting efficiency, this thesis designs and implements an OpenCL compilation system that supports the domestic SW26010 many-core processor, and studies OpenCL optimization methods. The main contributions are as follows: (1) Based on the OpenCL programming framework and the microarchitectural features of the domestic many-core processor, this thesis proposes a mapping of the OpenCL platform model, memory model, and execution model onto the SW26010 many-core processor. (2) Targeting the characteristics of the hardware structure, this thesis proposes OpenCL optimization methods for many-core accelerators, such as thread merging and data layout.
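The abstract names thread merging as one of the optimizations but does not describe how the thesis implements it. As a rough illustration only, the sketch below emulates the general idea in plain C: several OpenCL-style work-items are folded into one thread, so fewer threads are launched while the result stays the same. The SAXPY kernel body, the function names (`saxpy_work_item`, `run_baseline`, `run_merged`), and the `factor` parameter are all hypothetical choices made for this example and are not taken from the thesis or from any SW26010 toolchain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical kernel body: one OpenCL-style work-item handles element gid. */
static void saxpy_work_item(size_t gid, float a, const float *x, float *y)
{
    y[gid] = a * x[gid] + y[gid];
}

/* Baseline: emulate an NDRange of n work-items, one element per work-item. */
void run_baseline(size_t n, float a, const float *x, float *y)
{
    for (size_t gid = 0; gid < n; ++gid)
        saxpy_work_item(gid, a, x, y);
}

/* Merged: each emulated thread covers `factor` consecutive work-items,
 * so only ceil(n / factor) threads are needed. Returns that thread count.
 * The last thread handles the remainder when n is not a multiple of factor. */
size_t run_merged(size_t n, size_t factor, float a, const float *x, float *y)
{
    assert(factor > 0);
    size_t threads = (n + factor - 1) / factor;
    for (size_t tid = 0; tid < threads; ++tid) {
        size_t begin = tid * factor;
        size_t end = (begin + factor < n) ? begin + factor : n;
        for (size_t gid = begin; gid < end; ++gid)
            saxpy_work_item(gid, a, x, y);
    }
    return threads;
}
```

Calling `run_merged(10, 4, ...)` would use three emulated threads instead of ten work-items. On real hardware the outer loop would be distributed across compute cores rather than run sequentially, and the best merge factor would depend on the target's core count and local memory size, which this sketch does not model.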
[Degree-granting institution]: China West Normal University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP332