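Research on Task Partitioning and Scheduling Methods for Novel PIM Heterogeneous Systems
Published: 2018-03-20 02:03
Topic: PIM  Focus: memory wall  Source: Hefei University of Technology, 2017 master's thesis  Type: degree thesis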
[Abstract]: As computing systems enter the era of big data, processing in memory (PIM) is regarded as a revolutionary new architecture for alleviating the memory wall. The core idea of PIM, also called near data computing (NDC), is to couple storage components tightly with computing resources, thereby removing the memory bandwidth bottleneck and the overhead of moving data between the processor and memory. A system that combines PIM structures with a host processor is a new kind of heterogeneous parallel computing architecture. In recent years researchers have repeatedly proposed PIM structures built on emerging memory or 3D integration technologies, but general-purpose task scheduling methods for this class of heterogeneous platform have received little study. To address the hardware redundancy and lack of generality of existing customized PIM structures, this thesis proposes a formal model, based on task graph analysis, that quantifies the performance and energy consumption of the PIM+CPU heterogeneous parallel computing architecture, and for the first time proposes an application partitioning and mapping framework for this architecture. In this framework, an application is partitioned into a set of subtasks, and the proposed execution-unit mapping algorithm PEFT (PIM-oriented Earliest-Finish-Time) schedules each subtask onto a suitable execution unit (CPU or PIM), so that the processor and the PIM structure execute subtasks in parallel and the performance gain from introducing PIM is maximized. After producing an optimal subtask scheduling order, PEFT selects for each subtask the processing unit that yields the minimum finish time. The evaluation uses data-intensive machine learning applications as the benchmark set, and a real 3D DRAM product, HMC-2.0, serves as the memory under evaluation. Experimental results show that, compared with a conventional computing architecture, the proposed partitioning and mapping framework reduces application execution time by 46% on average, significantly improving system performance.
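The abstract names the two steps of PEFT (order the subtasks, then give each one the unit with the earliest finish time) but does not spell out the details. The following is only a minimal sketch of that general idea, not the thesis's algorithm: the ordering rule is the upward rank borrowed from HEFT-style list scheduling, and the task names, costs, and the `schedule` function are all hypothetical values chosen for illustration.

```python
from functools import lru_cache

UNITS = ("CPU", "PIM")  # the two execution units named in the abstract


def schedule(cost, succ, comm):
    """Greedy earliest-finish-time list scheduling on CPU and PIM.

    cost[t][u]   : assumed run time of task t on unit u (must be > 0)
    succ[t]      : tasks that depend on t (edges of the task graph)
    comm[(a, b)] : assumed transfer time if a and b run on different units
    Returns {task: (unit, start, finish)}.
    """
    pred = {t: [] for t in cost}
    for a, children in succ.items():
        for b in children:
            pred[b].append(a)

    @lru_cache(maxsize=None)
    def rank(t):
        # Upward rank: mean cost of t plus the longest path to an exit task.
        # With positive costs, every task outranks all of its successors.
        mean = sum(cost[t].values()) / len(UNITS)
        tail = max((comm.get((t, s), 0) + rank(s) for s in succ.get(t, [])),
                   default=0)
        return mean + tail

    unit_free = {u: 0.0 for u in UNITS}  # time at which each unit becomes idle
    placed = {}
    for t in sorted(cost, key=rank, reverse=True):
        best = None
        for u in UNITS:
            # A task may start once all inputs have arrived on unit u.
            ready = 0.0
            for p in pred[t]:
                p_unit, _, p_finish = placed[p]
                transfer = comm.get((p, t), 0) if p_unit != u else 0
                ready = max(ready, p_finish + transfer)
            start = max(ready, unit_free[u])
            finish = start + cost[t][u]
            if best is None or finish < best[2]:
                best = (u, start, finish)
        placed[t] = best
        unit_free[best[0]] = best[2]
    return placed


# Hypothetical four-task graph: load -> {dist, sort} -> vote
cost = {
    "load": {"CPU": 4, "PIM": 2},
    "dist": {"CPU": 6, "PIM": 3},
    "sort": {"CPU": 2, "PIM": 5},
    "vote": {"CPU": 1, "PIM": 4},
}
succ = {"load": ["dist", "sort"], "dist": ["vote"], "sort": ["vote"], "vote": []}
comm = {("load", "dist"): 2, ("load", "sort"): 2,
        ("dist", "vote"): 1, ("sort", "vote"): 1}

plan = schedule(cost, succ, comm)
makespan = max(finish for _, _, finish in plan.values())
cpu_only = sum(c["CPU"] for c in cost.values())
```

In this made-up example the memory-bound tasks `load` and `dist` land on PIM while `sort` and `vote` run on the CPU, giving a makespan of 7 time units against 13 for running everything serially on the CPU. These numbers illustrate the mechanism only and say nothing about the 46% figure reported in the thesis.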
[Degree-granting institution]: Hefei University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP333
Article ID: 1636994
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1636994.html