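Optimized Design and Tape-out Testing of a High-Performance DSP Core
Published: 2018-06-20 07:31
Topics: DSP core + code optimization; Source: National University of Defense Technology, 2012 master's thesis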
【Abstract】: The performance of the YHFT-DX DSP core is key to the overall performance of the YHFT-DX high-performance digital signal processor. The core is implemented in a 65nm process, and its clock frequency must rise from the original 600MHz to a target of 800MHz. Taking performance optimization of the YHFT-DX DSP core's datapath and level-1 cache as its background, this thesis studies RTL-level code optimization methods, circuit design optimization techniques, and a design method that combines full-custom and semi-custom design. The main work is as follows:
1. RTL-level code optimization was applied to the functional units and level-1 cache of the DSP core, focusing on the timing-critical memory access unit and multiplication unit. The effective methods commonly used include reducing port multiplexing, logic relocation, logic duplication, and algorithm parallelization. After RTL-level code optimization, the DSP core achieves timing closure within a 1ns clock period.
2. The adder and multiplier on the critical path of the DSP core were implemented with a combination of full-custom and semi-custom design. The 32-bit adder in the memory access unit uses a manual semi-custom implementation based on standard cells; post-layout simulation shows that its delay is essentially the same as that of a full-custom adder, while the design cycle is much shorter. In the 16-bit SIMD multiplier, the critical path is implemented in full-custom and the non-critical paths in semi-custom; post-layout simulation shows that its delay is 24% lower than that of a semi-custom design. Used appropriately, the two methods improve design performance while shortening the design cycle.
3. To verify that the optimization of the YHFT-DX DSP core is effective, a core test chip was designed, and board-level tests were run on the chip after tape-out. The test results show that the YHFT-DX DSP core reaches 900MHz at 1.0V under typical CMOS process conditions with correct functionality, meeting the target frequency.
In summary, the optimization methods applied to the DSP core are practical and effective.
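The frequencies quoted in the abstract can be related to clock periods with simple arithmetic. The following is a minimal sketch of that conversion only; the function names are illustrative and do not come from the thesis, and the numbers are those stated in the abstract.

```python
def period_ns(freq_mhz: float) -> float:
    """Return the clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def freq_mhz(period_in_ns: float) -> float:
    """Return the clock frequency in MHz for a period in nanoseconds."""
    return 1000.0 / period_in_ns


# Frequencies quoted in the abstract (labels are illustrative).
quoted = [("original", 600), ("target", 800), ("measured", 900)]

for label, f in quoted:
    print(f"{label:>8}: {f} MHz -> {period_ns(f):.3f} ns")

# The RTL-level timing closure figure is quoted as a period, not a frequency.
print(f"1 ns period -> {freq_mhz(1.0):.0f} MHz")
```

By this conversion, the 800MHz target corresponds to a 1.25 ns period, so the 1 ns period quoted for RTL-level timing closure is a tighter constraint than the target itself requires.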
【Degree-granting institution】: National University of Defense Technology
【Degree level】: Master's
【Year conferred】: 2012
【Classification number】: TP332
Article ID: 2043490
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2043490.html