Design and Implementation of the X-DSP SIMD Floating-Point Arithmetic Logic Unit
Published: 2018-06-07 17:47
Topic: SIMD + floating-point arithmetic logic unit; Source: National University of Defense Technology, master's thesis, 2013
[Abstract]: To meet the growing demand for digital signal processing in high-performance computing, military applications, wireless communication, and video and image processing, we independently designed X-DSP, a high-performance 64-bit multi-core DSP that supports SIMD, adopts a very long instruction word (VLIW) architecture, and has a design clock frequency of 1.5 GHz. Building on the development of X-DSP, this thesis studies and designs a high-performance floating-point arithmetic logic unit (FALU) for DSPs, to meet the digital signal processor's requirements for floating-point arithmetic and logic operations. The main work is as follows:
1. The floating-point arithmetic logic unit is studied in depth. Based on the requirements of X-DSP, its functions are divided into four classes: comparison, addition and subtraction, conversion, and special operations. A corresponding instruction set is designed, and on this basis the overall structure of a 64-bit high-speed FALU supporting single-precision SIMD operations is planned and designed.
2. The structure and detailed implementation of each submodule are described, and design optimization methods are studied. Different optimization strategies are applied according to the characteristics of each module in the unit. On this basis, the key components used in the unit, such as the adder, the shifter, and the leading-one predictor, are designed in detail.
3. Detailed function-point verification and random verification are carried out on each submodule and on the complete unit. A visual module-level instruction simulator was developed during verification; it greatly reduces tedious, repetitive verification work and improves verification efficiency and accuracy. The unit was iteratively corrected according to verification feedback to ensure functional correctness.
4. The floating-point arithmetic logic unit and its submodules are synthesized with the Cadence RTL Compiler tool, and synthesis strategies are studied. Under TSMC's 45 nm process, the synthesis results show a critical path delay of 450 ps, a cell area of 47690 μm², a total area of 130350 μm², and a total power consumption of 4.34 mW. These results show that the design meets the performance requirements of the X-DSP floating-point arithmetic logic unit.
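The abstract describes a 64-bit FALU that supports single-precision SIMD operations. As a rough illustration of that idea only, the Python sketch below treats a 64-bit word as two independent 32-bit IEEE 754 lanes and adds them lane by lane. The lane layout (low lane in bits 0-31) and the function names pack_lanes, unpack_lanes, and simd_fadd32 are assumptions made for this example; the abstract does not specify the X-DSP register layout, instruction names, or rounding behaviour.

```python
import struct
from typing import Tuple


def pack_lanes(lo: float, hi: float) -> int:
    """Pack two single-precision values into one 64-bit word.

    Assumed layout (not taken from the thesis): lo in bits 0-31, hi in bits 32-63.
    struct.pack("<f", ...) rounds each value to the nearest single-precision number.
    """
    lo_bits, hi_bits = struct.unpack("<II", struct.pack("<ff", lo, hi))
    return (hi_bits << 32) | lo_bits


def unpack_lanes(word: int) -> Tuple[float, float]:
    """Split a 64-bit word back into its two single-precision lanes (lo, hi)."""
    lo_bits = word & 0xFFFFFFFF
    hi_bits = (word >> 32) & 0xFFFFFFFF
    return struct.unpack("<ff", struct.pack("<II", lo_bits, hi_bits))


def simd_fadd32(a: int, b: int) -> int:
    """Hypothetical two-lane single-precision add on packed 64-bit operands.

    Each lane is added independently; no carry or state crosses the lane boundary.
    The sum is computed in Python's double precision and then rounded to single
    precision when it is repacked.
    """
    a_lo, a_hi = unpack_lanes(a)
    b_lo, b_hi = unpack_lanes(b)
    return pack_lanes(a_lo + b_lo, a_hi + b_hi)


if __name__ == "__main__":
    x = pack_lanes(1.5, -0.5)
    y = pack_lanes(2.25, 0.5)
    print(unpack_lanes(simd_fadd32(x, y)))
```

This is a behavioural model only: it says nothing about the adder, shifter, or leading-one prediction circuits the thesis designs, and struct.pack raises OverflowError if a lane sum exceeds the single-precision range rather than producing infinity as hardware typically would.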
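[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree conferred]: 2013
[Classification number]: TP332.2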
Article ID: 1992145
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1992145.html