
Design, Verification, and Optimization of the Fixed-Point Arithmetic Unit and Shuffle Unit of M-DSP

Published: 2018-10-09 21:48
[Abstract]: As data volumes grow and the demand for real-time information processing rises in aerospace, communications, medicine, and other fields, high-performance digital signal processors (DSPs) have become a research focus in China and abroad. M-DSP is an independently developed 32-bit high-performance DSP. It adopts an 11-issue very long instruction word (VLIW) architecture, offers strong parallel computing capability, and reaches a clock frequency of 1 GHz in a 40 nm process. Based on the M-DSP development platform, this thesis completes the design, optimization, and verification of the IALU unit and the shuffle unit. The main work is as follows.

1. Following the M-DSP design requirements, the instruction set and microarchitecture of the IALU unit are designed, and two IALU designs with different strengths are implemented. The first is a discrete-adder IALU structure built around a Kogge-Stone tree. It has better timing and lends itself to fine-grained power control through gating, but its area is larger. The second is an IALU built on a two-level carry-lookahead adder, in which most IALU instructions are implemented by reusing the adder. Its area is smaller, but its structure is complex and its timing is comparatively poor. Based on the design needs of M-DSP, the first scheme is adopted.

2. Conventional shuffle instructions require a Load instruction to load the shuffle pattern in advance. This approach occupies too many system register resources and takes many cycles to execute. To overcome these problems, this thesis designs an efficient shuffle unit that separates configuration from execution and has dedicated shuffle-pattern address registers and a shuffle-pattern memory.

3. A complete and detailed verification plan is designed according to the characteristics of the IALU unit and the shuffle unit. Simulation-based verification is the main method, applied to both units from module level to system level. Module-level verification covers function points, ATEC, and random-number tests; system-level verification covers global signals and instruction combinations. Coverage statistics are collected for the verification runs and analyzed to eliminate verification blind spots. In addition, formal verification is used to check that the post-synthesis netlist is consistent with the RTL code.

4. The timing of the IALU unit and the shuffle unit is optimized with tree-structured selection, logic optimization, and pipelining, and RTL-level power is optimized with clock gating, logic restructuring, operand isolation, and state-encoding optimization. Finally, both units are synthesized with the Design Compiler synthesis tool in a 40 nm CMOS process. The IALU has a critical-path delay of 400 ps and a total area of 7004.2372 μm²; the shuffle unit has a critical-path delay of 430 ps and a total area of 151811.721 μm². The results show that performance and area meet the M-DSP design requirements.
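[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP332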
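
The abstract names a Kogge-Stone tree as the core of the adopted IALU adder but gives no implementation detail. The following is only a behavioural sketch of a generic 32-bit Kogge-Stone parallel-prefix addition in Python, not the thesis's RTL. The names `WIDTH`, `MASK`, and `kogge_stone_add` are hypothetical and chosen for illustration; the 32-bit width is taken from the abstract's description of M-DSP as a 32-bit DSP.

```python
from typing import Tuple

WIDTH = 32                 # assumption: operand width, from "32-bit DSP" in the abstract
MASK = (1 << WIDTH) - 1


def kogge_stone_add(a: int, b: int, carry_in: int = 0) -> Tuple[int, int]:
    """Add two WIDTH-bit unsigned values using Kogge-Stone prefix stages.

    Returns (sum modulo 2**WIDTH, carry_out). Each Python integer holds
    one signal per bit position, so every loop iteration models one
    level of the prefix tree operating on all bits in parallel.
    """
    a &= MASK
    b &= MASK
    carry_in &= 1

    propagate = a ^ b      # bit i passes an incoming carry along
    generate = a & b       # bit i creates a carry by itself
    # Fold the carry-in into bit 0: g0 = g0 | (p0 & cin).
    generate |= propagate & carry_in

    group_p = propagate
    group_g = generate
    distance = 1
    while distance < WIDTH:  # log2(WIDTH) stages: distances 1, 2, 4, 8, 16
        # Combine each group with the group `distance` bits below it.
        group_g = (group_g | (group_p & (group_g << distance))) & MASK
        group_p = (group_p & (group_p << distance)) & MASK
        distance <<= 1

    # group_g bit i is now the carry out of bit i; the carry into bit i
    # is therefore group_g bit i-1, with carry_in feeding bit 0.
    carries = ((group_g << 1) | carry_in) & MASK
    total = propagate ^ carries
    carry_out = (group_g >> (WIDTH - 1)) & 1
    return total, carry_out
```

For example, `kogge_stone_add(0xFFFFFFFF, 1)` wraps around to a zero sum with the carry-out set. The five prefix stages for 32 bits illustrate why this adder family is generally associated with short critical paths at the cost of more wiring and area, which matches the timing/area trade-off the abstract describes.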





Link to this article: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2260906.html

