Research and Implementation of the Control Core of the ESCA High-Performance Processor
Published: 2018-01-09 20:20
Source: Huazhong University of Science and Technology, 2012 master's thesis. Document type: degree thesis
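Keywords: high-performance computing; hybrid computing; multi-core coprocessor; ESCA; control core; explicit memory access mechanism; hardware/software co-verification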
[Abstract]: Hybrid computing architectures use heterogeneous processors to exploit the architectural strengths of each processor type: control-intensive and compute-intensive tasks are optimized separately, and the processors cooperate to accelerate the application. This approach has become one of the major trends in high-performance computing architecture. Following the hybrid computing idea, our project group designed ESCA (Engineering and Scientific Computing Accelerator), a high-performance multi-core processor aimed at engineering and scientific computing and at multimedia applications. ESCA works as a coprocessor that accelerates the compute-intensive tasks of an application, and it achieves high performance through SIMD, vector, and sub-word techniques. The ESCA processor consists of two parts, a control core and a computing array; this thesis focuses on the key techniques of the control core and their implementation. The thesis first introduces the relevant models from the viewpoint of the ESCA system, then describes the key architectural aspects of the ESCA processor: its instruction set, hardware framework, and memory organization. On this basis, it determines the specific functional responsibilities of the control core and defines its microarchitecture. The control core instruction set uses hierarchical encoding and extends the control instructions to support special control flow. An explicit memory access mechanism is proposed to optimize large-scale, regular data transfers. The hardware implementation is organized around the pipeline and seeks a trade-off between performance and overhead. The complex control flow of the control core is verified with a hardware/software co-verification method; a hybrid verification platform was designed, and its automated verification flow greatly shortens the verification cycle. The final ESCA processor design was implemented as a silicon prototype with an operating frequency of 250 MHz and a total area of 17676582.00 μm², of which the control core occupies 3107821.56 μm², a hardware overhead of 17.58%. With DGEMM as the benchmark program, the explicit memory access mechanism implemented in the system was evaluated: the memory access latency it hides amounts to 56% of the total run time, and it yields a speedup of 1.5×. This shows that the mechanism effectively compensates for the speed gap between computation and memory access and improves the computing efficiency of the system.
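The figures in the abstract can be checked and illustrated with a short sketch. The area values below are the ones reported in the abstract; everything else is an assumption made for illustration only. In particular, the two timing functions are a generic double-buffering model (load the next tile while the current one is computed), not the thesis's actual explicit memory access mechanism, and the tile count and per-tile timings are hypothetical rather than measured ESCA data.

```python
def area_overhead_percent(part_um2: float, total_um2: float) -> float:
    """Share of the total area taken by one block, in percent."""
    return 100.0 * part_um2 / total_um2


def serial_time(n_tiles: int, t_load: float, t_compute: float) -> float:
    """Every tile is loaded, then computed, with no overlap."""
    return n_tiles * (t_load + t_compute)


def overlapped_time(n_tiles: int, t_load: float, t_compute: float) -> float:
    """Generic double buffering: tile i+1 is loaded while tile i is computed."""
    if n_tiles == 0:
        return 0.0
    # Only the first load and the last compute cannot be overlapped.
    steady = (n_tiles - 1) * max(t_load, t_compute)
    return t_load + steady + t_compute


# Area figures as reported in the abstract (square micrometres).
CONTROL_CORE_AREA_UM2 = 3107821.56
TOTAL_AREA_UM2 = 17676582.00
overhead = area_overhead_percent(CONTROL_CORE_AREA_UM2, TOTAL_AREA_UM2)
print(f"control core overhead: {overhead:.2f}%")

# Hypothetical workload: 100 tiles, loading takes half as long as computing.
n, t_load, t_compute = 100, 1.0, 2.0
speedup = serial_time(n, t_load, t_compute) / overlapped_time(n, t_load, t_compute)
print(f"modelled speedup: {speedup:.2f}x")
```

The first result reproduces the 17.58% overhead stated in the abstract. The second only shows how overlapping transfers with computation can shorten run time in a simple model; it is not a reconstruction of the reported DGEMM measurement.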
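[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP332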
Article ID: 1402505
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1402505.html