Arbitrary-Step NUCA Data Promotion Combined with Bank Coherence on CMPs
[Abstract]: Computers have become indispensable tools for daily life and work, and users keep demanding more of them: higher processing speed, larger storage capacity, and more convenient, friendlier interfaces. To make processors faster, manufacturers kept raising the clock frequency, but higher frequencies bring higher power consumption, which has itself become the bottleneck for processor speed. The chip multiprocessor (CMP) emerged in response: it integrates multiple processor cores on a single chip to increase computing power. CMPs are now the market mainstream, which makes research on CMP chips necessary. Meanwhile, integrated-circuit manufacturing processes have advanced rapidly and on-chip caches keep growing, but the wire delay of a large on-chip cache grows with its capacity, and this growing wire delay significantly limits CPU processing speed. In response, Kim C et al. proposed the Non-Uniform Cache Architecture (NUCA), which allows different cache banks to have different access latencies and therefore achieves a lower average access latency than the earlier Uniform Cache Architecture (UCA). In dynamic NUCA (DNUCA), the cache supports migration of cache lines (i.e., data blocks): data that hits can be moved to a bank closer to the accessing processor, reducing the latency of later accesses to the same data by that CPU. This movement of data within the cache is called data promotion or block migration.
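The promotion mechanism described above can be illustrated with a small model. This is only a sketch under stated assumptions, not the simulator used in the thesis: the latency constants, the single chain of banks, and the swap-based `access` function are all hypothetical and chosen for illustration.

```python
# Hypothetical model: one cache set spread over a chain of banks.
# banks[0] is the bank closest to the CPU. Latency values are assumed.
BASE_LATENCY = 4   # cycles to reach the nearest bank (assumed)
HOP_LATENCY = 2    # extra cycles per bank farther away (assumed)


def bank_latency(index):
    """Access latency of the bank at the given distance from the CPU."""
    return BASE_LATENCY + HOP_LATENCY * index


def access(banks, tag, step=1):
    """Look up `tag`; on a hit, promote it `step` banks closer by swapping.

    Returns the latency of this access, or None on a miss.
    """
    if tag not in banks:
        return None
    hit = banks.index(tag)
    latency = bank_latency(hit)
    target = max(0, hit - step)
    # Fixed-step promotion: swap with whatever the target bank holds,
    # without checking whether the displaced line is more useful.
    banks[hit], banks[target] = banks[target], banks[hit]
    return latency


banks = ["A", "B", "C", "D"]
first = access(banks, "D")    # hit in the farthest bank
second = access(banks, "D")   # same line, now one bank closer
```

In this model the second access to `D` is cheaper than the first because the line has moved one bank closer, while line `C` has been pushed one bank farther away regardless of how useful it is.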
Data promotion must find a target bank to hold the promoted data. Some existing promotion techniques, however, ignore the actual state of the target bank and use a fixed promotion step. During promotion they are therefore likely to replace more useful data in the target bank, or to push that data to a bank farther from the CPU, which pollutes the cache and keeps promotion from achieving good results. On a CMP, improving the promotion technique also has to address an important issue: shared data. Multiple cores on one chip share an L2 or L3 cache and may access the same shared data at the same time. Data promotion, however, moves the data accessed by the current CPU to a bank closer to that CPU so that its next access to the same data is faster. When multiple CPUs access the same shared data, the data is therefore pulled toward the middle of the NUCA, which limits the benefit of promotion. The improved promotion technique is thus combined with bank coherence, which allows shared data to have multiple copies in the NUCA, each belonging to a different CPU, and keeps the copies coherent. This resolves the contention for shared data and speeds up CPU accesses to it. Maintaining coherence requires recording the state of each data copy, and the promotion strategy proposed in this thesis uses exactly these cache-line states to select the target bank for migration, which is why it is combined with bank coherence.
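The abstract does not spell out the rule that maps cache-line states to a target bank, so the following is a hypothetical sketch of the general idea only. The three states, the `DISPLACE_COST` table, the `max_cost` threshold, and the `choose_target` function are assumptions made for illustration, not the thesis's actual policy.

```python
from enum import Enum


class State(Enum):
    """Assumed coherence states of the line currently held by a bank."""
    INVALID = 0
    SHARED = 1
    MODIFIED = 2


# Assumed ranking: a lower cost means the resident line is cheaper to displace.
DISPLACE_COST = {State.INVALID: 0, State.SHARED: 1, State.MODIFIED: 2}


def choose_target(states, hit, max_cost=0):
    """Pick the bank to promote into, given the state of each bank's line.

    `states[i]` is the state of the line in bank i (bank 0 is closest to
    the CPU) and `hit` is the bank where the requested line was found.
    Returns the closest bank whose resident line costs at most `max_cost`
    to displace, or `hit` itself if no closer bank qualifies.
    """
    for i in range(hit):
        if DISPLACE_COST[states[i]] <= max_cost:
            return i
    return hit


states = [State.MODIFIED, State.INVALID, State.SHARED, State.SHARED]
target = choose_target(states, hit=3)
step = 3 - target   # the promotion step follows from the chosen bank
```

Here the step is not fixed in advance: it is whatever distance separates the hit bank from the chosen target, so it can be zero when every closer bank holds a line that should not be displaced.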
This thesis first briefly introduces the research background and related technologies, then surveys several basic simulation tools for computer architecture research, describing the Simics simulator in detail. It next presents the existing fixed-step data promotion technique and its problems. After introducing bank coherence, it describes in detail the arbitrary-step data promotion technique combined with bank coherence on a CMP. Finally, the technique is evaluated by full-system simulation with the NAS Parallel Benchmarks (NPB). The results show that it effectively reduces the latency of processor accesses to the shared cache; compared with the design of Kim C et al., average IPC increases by 8.19%.
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree granted]: 2012
[Classification number]: TP332
Article ID: 2438420
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2438420.html