
Design and Implementation of the Shared Memory Component of the Multi-core X-DSPX

Published: 2018-02-22 23:03

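  Keywords: digital signal processing; multi-core; shared memory; optimization; synthesis. Source: National University of Defense Technology, 2013 master's thesis. Document type: degree thesis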


[Abstract]: With the continual expansion of digital signal processing applications, ever higher demands are placed on the performance, power consumption and cost of DSP (Digital Signal Processing) application systems, driving a shift from single-core to multi-core DSP technology. At the same time, as the number of threads a processor schedules concurrently grows, providing on-chip computing resources with fast access to shared data has become one of the key technical problems for multi-core DSPs. Research on efficient shared memory techniques in a multi-core environment is therefore of real practical significance for the further development of multi-core DSP technology.

X-DSPX is a new-generation high-performance 32-bit floating-point multi-core DSP microprocessor chip developed independently by our university. This thesis analyzes in depth the functional characteristics of the shared memory components in mainstream DSPs. Following the X-DSPX design requirements, and working from three aspects (the memory array, the data path and the controller), it designs and implements the read/write command queues, arbiter, command decoding, address generation and serial-parallel data conversion of the SMC (Shared Memory Controller) module. The SMC component is also structurally optimized and functionally verified, and a more effective arbitration mechanism based on the fairness principle is proposed. The main work of the thesis includes:

1. Design and implementation of the SMC component of X-DSPX. Based on the functional requirements that X-DSPX places on the SMC component, the design supports concurrent access by four DSPs to the SMC memory banks, access through the SMC component to DDR2, EMIF and remote L2 memory space, and background data movement by DMA on the SMC memory banks. This achieves efficient use of shared data and lets the SMC component serve as a data crossover channel.

2. An optimization method that lowers power consumption and reduces memory access latency. Applying the principle of per-bank control, the structure of the SMC component was adjusted; the results show that its power consumption fell by 80%. Pipelined operation and concurrent arbitration reduced the latency of DSP requests to the SMC component by one clock cycle, improving memory access efficiency.

3. An arbitration mechanism that combines fixed-priority and rotating-priority algorithms. Based on the fairness principle, when requests of different priorities target the SMC memory banks at the same time, fixed priority does not cause any request to be "starved" or "overfed", and request priorities are switched dynamically to determine the order in which requests obtain the SMC memory bank resources. This gives a more reasonable service order when different requests run into structural hazards.

4. Functional testing and logic synthesis of the SMC component. Test vectors were developed and functional verification was carried out according to the X-DSPX system requirements; with targeted test programs, the code coverage of the SMC component exceeds 99%. Synthesized with a 65nm standard-cell library, the SMC component reaches a maximum operating frequency of 555MHz. Under the system's minimum internal clock frequency requirement of 500MHz, synthesis yields an area of 65783 μm² and a power consumption of 3.8692 mW, which meets the system design requirements.
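
The abstract describes the arbitration mechanism of item 3 only in outline, so the following Python sketch is an illustration under stated assumptions, not the thesis's actual design. It shows one plausible way to combine fixed and rotating priority: requesters start in a fixed priority order, and each winner is demoted to the lowest priority after its grant. The class name `HybridArbiter`, the requester names and the initial order (DMA first) are all hypothetical.

```python
class HybridArbiter:
    """Toy arbiter: fixed initial priority order, rotated after each grant.

    Assumption for illustration only; the source does not specify the scheme.
    """

    def __init__(self, fixed_order):
        # fixed_order: requester names, highest fixed priority first (hypothetical).
        self.order = list(fixed_order)

    def grant(self, requests):
        """Return the winner among `requests`, or None if nobody is requesting."""
        for name in self.order:
            if name in requests:
                # Demote the winner to lowest priority so that it cannot
                # monopolise the bank while others are waiting.
                self.order.remove(name)
                self.order.append(name)
                return name
        return None


# Usage: one DMA port and four DSPs all request the same bank every cycle.
arbiter = HybridArbiter(["DMA", "DSP0", "DSP1", "DSP2", "DSP3"])
all_requesting = {"DMA", "DSP0", "DSP1", "DSP2", "DSP3"}
grants = [arbiter.grant(all_requesting) for _ in range(6)]
print(grants)  # ['DMA', 'DSP0', 'DSP1', 'DSP2', 'DSP3', 'DMA']
```

In this sketch a requester that keeps its request asserted is served within N grants for N requesters, because only requesters ahead of it in the current order can win before it does. That bounds waiting time (no starvation) and prevents any single requester from winning back to back under contention.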
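[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2013
[Classification number]: TP333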




Article ID: 1525529


Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1525529.html

