Design and Implementation of the X-DSP Level-1 Data Cache
Published: 2018-04-14 18:26
Topic: DSP + Cache; Source: National University of Defense Technology, master's thesis, 2013
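[Abstract]: As integrated circuit technology advances, further gains in DSP performance face an increasingly severe "memory wall" problem. A "Cache + RAM" memory structure is one important way to address it. Designing an efficient, flexible level-1 data cache (L1D Cache) therefore plays an important role in improving a DSP's memory access efficiency and overall performance.
X-DSP is a 32-bit high-performance DSP developed independently by the Institute of Microelectronics at the National University of Defense Technology. It uses a very long instruction word (VLIW) architecture and supports two parallel Load/Store memory access requests. X-DSP adopts a two-level on-chip memory hierarchy: level-1 storage comprises a level-1 instruction cache and a level-1 data cache, and level-2 storage is a shared memory configurable as Cache/SRAM. This thesis studies the design and implementation of the L1D Cache. The main work is as follows:
1. Based on an analysis of X-DSP's overall architecture and the design requirements of its memory hierarchy, an L1D Cache whose capacity can be flexibly configured to suit application needs was designed and implemented. It uses two-way set-associative mapping, a pseudo-LRU replacement algorithm, and a write-back, no-write-allocate policy, and it supports two parallel memory accesses.
2. A combined hardware/software mechanism for maintaining L1D Cache data coherence was designed and implemented. The L1D Cache supports snoop operations from level-2 memory (L2SRAM), which keeps it coherent when DMA reads or writes L2SRAM, and it gives programmers a rich set of control registers for global or partial write-back or invalidation of the L1D Cache. A protection mechanism for the level-1 data memory and the control registers was also designed and implemented, ensuring that only requests matching the configured permissions can access the memory space or read and write the registers.
3. Based on the characteristics of X-DSP's memory access instructions, a scheme supporting cross-boundary access is proposed: an unaligned access that crosses a boundary is split into two aligned accesses. The scheme is efficient, has low hardware overhead, and places no extra burden on the compiler.
4. Based on the hit and miss behavior of memory access instructions in the L1D Cache, a memory access pipeline and a miss pipeline were designed and implemented, together with a write-miss buffer queue that is 128 bits wide and 4 entries deep and supports write merging, which effectively reduces the waiting time on write misses.
Finally, module-level functional verification and logic synthesis were carried out. The results show that the L1D Cache functions correctly and reaches a clock frequency of 1 GHz, meeting X-DSP's design requirements.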
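The cross-boundary scheme in item 3 can be pictured with a minimal Python sketch. It assumes an 8-byte alignment boundary and a per-byte enable mask; both are illustrative choices, and the names `split_unaligned` and `boundary` are hypothetical rather than taken from the thesis.

```python
def split_unaligned(addr: int, size: int, boundary: int = 8):
    """Cover [addr, addr + size) with one or two boundary-aligned accesses.

    Returns a list of (aligned_base, byte_mask) pairs, where bit i of
    byte_mask enables byte i of the aligned block. The 8-byte default
    boundary is an assumption made for illustration only.
    """
    if boundary <= 0 or boundary & (boundary - 1):
        raise ValueError("boundary must be a power of two")
    if not 0 < size <= boundary:
        raise ValueError("size must be between 1 and boundary bytes")

    first_base = addr & ~(boundary - 1)   # round down to the boundary
    offset = addr - first_base            # byte offset inside the first block
    end = offset + size                   # exclusive end, relative to first_base

    if end <= boundary:
        # The access fits in one aligned block: no split is needed.
        return [(first_base, ((1 << size) - 1) << offset)]

    full = (1 << boundary) - 1
    first_mask = (full >> offset) << offset      # bytes offset .. boundary-1
    second_mask = (1 << (end - boundary)) - 1    # bytes 0 .. end-boundary-1
    return [(first_base, first_mask), (first_base + boundary, second_mask)]


# A 4-byte access at 0x1006 crosses the 8-byte boundary at 0x1008,
# so it becomes two aligned accesses at 0x1000 and 0x1008.
pieces = split_unaligned(0x1006, 4)
```

Because the split happens on the address and the byte enables alone, software can issue the unaligned access as a single instruction, which is consistent with the abstract's claim that the compiler carries no extra burden.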
[Abstract]: With the development of integrated circuit technology, further improvement of DSP performance faces an increasingly serious "memory wall" problem. The "Cache + RAM" memory structure is one of the important ways to solve this problem. Designing an efficient and flexible level-1 data cache (L1D Cache) is very important for improving the memory access efficiency and overall performance of a DSP. X-DSP is a 32-bit high-performance DSP developed by the Institute of Microelectronics of the National University of Defense Technology. It uses a very long instruction word (VLIW) structure and supports two parallel Load/Store memory access requests. X-DSP uses a two-level on-chip memory architecture: the first level includes a level-1 instruction cache and a level-1 data cache, and the second level is a Cache/SRAM configurable shared memory. This thesis focuses on the design and implementation of the L1D Cache. The main work includes the following aspects: 1. On the basis of analyzing the overall structure and memory hierarchy design requirements of X-DSP, an L1D Cache whose capacity can be flexibly configured according to application requirements is designed and implemented. The L1D Cache uses two-way set-associative mapping, a pseudo-LRU replacement algorithm, and write-back and no-write-allocate policies, and supports two parallel memory accesses. 2. A data coherence maintenance mechanism for the L1D Cache that combines hardware and software is designed and implemented: the L1D Cache supports snoop operations from the level-2 memory (L2SRAM), so as to stay coherent when DMA reads and writes L2SRAM, and also provides the programmer with abundant control registers through which the L1D Cache can be globally or partially written back or invalidated. At the same time, a protection mechanism for the level-1 data memory and the control registers is designed and implemented, which ensures that only requests that conform to the permission configuration can access the memory space and read and write the registers. 3. In view of the characteristics of X-DSP memory access instructions, a solution supporting cross-boundary access is proposed: an unaligned access that crosses a boundary is split into two aligned accesses. The solution is efficient, has a small hardware cost, and does not add any burden on the compiler. 4. According to the hit and miss characteristics of memory access instructions processed by the L1D Cache, a memory access pipeline and a miss pipeline are designed and implemented, along with a write-miss buffer queue with a width of 128 bits and a depth of 4 that supports write merging, which effectively reduces the waiting time of write misses. Finally, module-level functional verification and logic synthesis are carried out. The results show that the L1D Cache functions correctly and its clock frequency reaches 1 GHz, which meets the design requirements of X-DSP.
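The write-miss buffer queue in item 4 can be illustrated with a small behavioral model in Python. Only the 128-bit entry width, the depth of 4, and the existence of write merging come from the abstract. The merge rule (any write falling in an already-buffered 16-byte block merges into that entry), the oldest-first drain order, and the class and method names are assumptions made for this sketch, not the thesis's design.

```python
from collections import OrderedDict

ENTRY_BYTES = 16  # 128-bit entry width, as stated in the abstract
DEPTH = 4         # queue depth, as stated in the abstract


class WriteMissBuffer:
    """Behavioral model of a write-miss buffer with write merging (hypothetical)."""

    def __init__(self):
        # Maps a 16-byte-aligned base address to [data, byte_mask],
        # kept in allocation order so the oldest entry drains first.
        self.entries = OrderedDict()

    def write(self, addr: int, data: bytes) -> bool:
        """Buffer a write miss. Returns False when the queue is full (stall)."""
        base = addr & ~(ENTRY_BYTES - 1)
        offset = addr - base
        if offset + len(data) > ENTRY_BYTES:
            raise ValueError("write must not cross a 16-byte entry boundary")
        if base not in self.entries:
            if len(self.entries) >= DEPTH:
                return False  # no free entry and nothing to merge with
            self.entries[base] = [bytearray(ENTRY_BYTES), 0]
        entry = self.entries[base]
        entry[0][offset:offset + len(data)] = data       # merge the new bytes
        entry[1] |= ((1 << len(data)) - 1) << offset     # mark them valid
        return True

    def drain(self):
        """Pop the oldest entry as (base, data, byte_mask), or None if empty."""
        if not self.entries:
            return None
        base, (data, mask) = self.entries.popitem(last=False)
        return base, bytes(data), mask


buf = WriteMissBuffer()
buf.write(0x100, b"\x11\x22")  # allocates an entry for block 0x100
buf.write(0x104, b"\x33")      # same block: merged, no new entry used
```

Merging means several write misses to the same 128-bit block occupy one entry and leave as a single transfer to the next memory level, which is one way such a queue can shorten the time the processor waits on write misses.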
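[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP333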
Article ID: 1750490
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1750490.html