
Design of a Hardware Acceleration Engine for Key Algorithms in Heterogeneous Multi-core Systems

Published: 2018-11-17 15:32
[Abstract]: As modern digital signal processing moves toward large data volumes and high-speed real-time computation, high-speed, high-performance computing and the means of implementing it have become an important topic in modern mathematics and information processing. Traditional single-core systems can no longer cope with the sheer volume of information or with the processing of complex signals. Multi-core chip technology offers an effective solution to this problem and has been widely accepted by industry. In particular, heterogeneous multi-core systems, which integrate processor cores of different architectures and functions, can assign different computing tasks to different cores for parallel processing and accelerate task execution through heterogeneous computing units. They provide a more flexible and efficient processing mechanism that meets the needs of many applications, and have become one of the main directions for future multi-core technology. As the demand for high-performance, high-density computation on complex signals keeps growing, however, the traditional approach of mapping computing tasks onto the different cores of a multi-core system can no longer meet high-speed real-time requirements. Architectures that combine multiple cores with accelerators have emerged in response: some multi-core processors integrate custom accelerator cores to speed up specific applications, but their flexibility is limited. With the advent of reconfigurable technology, applying reconfigurable computing to hardware accelerators can bridge the gap in performance and flexibility between general-purpose computation and software computation, providing a higher-performance platform for processing complex high-speed signals. To address these problems, this thesis studies reconfigurable computing technology and hardware accelerators in heterogeneous multi-core systems.
The main work of this thesis is as follows. First, starting from the characteristics of application requirements, the thesis extracts the application and algorithm features of high-density computing, identifies the types of computation that are highly reusable and can effectively improve system performance, and analyzes and optimizes the algorithms for these computation types, proposing an improved matrix inversion algorithm, a function fitting algorithm, and hardware architectures for these algorithms. Second, building on the optimized algorithms and structures, it designs a reconfigurable hardware acceleration engine for heterogeneous multi-core systems. The engine mainly targets matrix operations in high-density computing, especially matrix inversion: it can efficiently invert single-precision real matrices of order 16, 32, 64, and 128. In addition, without adding arithmetic or storage resources, the engine is reconfigured to support fitting and multi-operand operations, which extends its functionality. Finally, the thesis tests the engine experimentally and analyzes its performance, describes its integration into a heterogeneous multi-core system, and verifies that the designed engine achieves high performance.
[Degree-granting institution]: Hefei University of Technology
[Degree]: Master's
[Year degree awarded]: 2016
[Classification number]: TP301.6


Article ID: 2338314


Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2338314.html

