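Optimized Design of a Memory Controller for a High-Performance CPU
Published: 2018-06-17 00:07
Topics: memory controller + address mapping; Source: National University of Defense Technology, master's thesis, 2012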
[Abstract]: Memory access speed has a significant effect on processor performance, especially in multi-core, multithreaded processors. Memory access speed is in turn constrained by the memory controller, which determines key parameters such as the maximum memory capacity the system can use, the number of banks, the memory type and speed, and the data depth and data width of the memory chips. The quality of the memory controller design therefore directly affects processor performance. This thesis studies the optimized design of the memory controller in the X processor. The X processor is a high-performance processor that supports multithreading and SIMD. It integrates 16 cores with 4 threads each, and its execution units consist of two integer units, one vector unit, one floating-point unit, and one load/store unit. Four dual-channel memory controllers are integrated on the chip and support parallel memory access. When the working set is very large, the volume of data is huge and the pressure on memory increases. Running several memory controllers in parallel relieves that pressure to some extent, but the memory access address stream becomes scattered, so the memory controllers cannot perform to their full capability.

Building on an in-depth study of the X processor and DDR3 SDRAM, and with the goal of reducing memory access latency, this thesis analyzes the basic structure of the existing memory controller and optimizes it. To improve program locality, bank-level parallelism, and row locality, it designs a full-XOR address mapping scheme. To raise the row hit rate of memory access commands and reduce read/write switching delay, it designs a hierarchical memory access scheduler that reorders requests at two levels, intra-bank scheduling and inter-bank scheduling, and includes an anti-starvation mechanism, so that memory bandwidth utilization is improved as far as possible. To reduce the latency caused by frequently opening and closing active pages, it adds a virtual buffer-row module between the on-chip buffer and the memory controller, which increases the number of active pages.

The optimized memory controller design is described in Verilog, and the optimized overall structure is fully functionally verified to ensure that the memory controller works correctly. Finally, detailed performance tests compare the structure before and after optimization: bandwidth rises from 5.88 GB/s to 18.55 GB/s, which demonstrates the advantage of the optimized design.
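The abstract does not spell out the full-XOR address mapping, so the following is only a behavioral sketch of the general idea behind XOR-based bank mapping, written in Python rather than the thesis's Verilog. It uses the common permutation-style scheme in which the bank index is XORed with the low bits of the row index. The field widths, the row/bank/column ordering, and the function names split_linear and split_xor are all assumptions made for illustration and are not taken from the thesis.

```python
# Behavioral sketch of XOR-based bank mapping.
# All field widths and the row|bank|column layout are assumptions for illustration.
COL_BITS = 10   # assumed number of column bits
BANK_BITS = 3   # assumed 8 banks
ROW_BITS = 14   # assumed number of row bits


def split_linear(addr):
    """Plain mapping: address = row | bank | column, from high bits to low bits."""
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return row, bank, col


def split_xor(addr):
    """XOR mapping: the bank index is XORed with the low bits of the row index."""
    row, bank, col = split_linear(addr)
    bank ^= row & ((1 << BANK_BITS) - 1)
    return row, bank, col


# Four addresses that differ only in the row field.
stride = 1 << (COL_BITS + BANK_BITS)
addrs = [i * stride for i in range(4)]

linear_banks = [split_linear(a)[1] for a in addrs]
xor_banks = [split_xor(a)[1] for a in addrs]

print(linear_banks)  # [0, 0, 0, 0]: every access hits bank 0 with a different row
print(xor_banks)     # [0, 1, 2, 3]: the same accesses spread over four banks
```

Under the plain mapping, the four accesses conflict in one bank and each forces a different row to be opened. With the XOR step they fall into different banks, so their row activations can overlap. Because the XOR uses only row bits, which are themselves unchanged, the mapping stays one-to-one.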
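[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP332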
Article ID: 2028650
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2028650.html