
A Multi-Level, Multi-Dimensional Joint Cache Partitioning Strategy for Multicore Processors

Published: 2018-03-02 09:14

  Keywords: cache partitioning; IPC; fairness; prefetching; multicore cache block predictor Source: Jilin University, 2013 doctoral dissertation Type: degree thesis


[Abstract]: As high-performance processor technology has advanced, the memory wall has become one of the main factors limiting processor system performance. Processors are typically about two orders of magnitude faster than memory accesses. Modern multicore processors widely adopt architectures built around a large shared last-level cache to narrow this gap. However, the traditional management policies designed for small private caches are not suited to a large shared last-level cache: they can increase the number of cache misses and trigger many expensive off-chip memory accesses. The main approaches to these problems include partitioning the shared cache, improving the cache replacement policy, and adding cache block prefetchers.

This thesis proposes two strategies for a three-level cache structure: a joint partitioning strategy and a predictor partitioning strategy. The joint partitioning strategy is, first, a hardware design that shares and consolidates the private second-level cache resources; second, it includes a partitioning algorithm for the shared last-level cache that takes both miss rate and fairness into account. The predictor partitioning strategy includes a multicore cache block predictor designed for the private first- and second-level caches, combined with the shared last-level cache partitioning algorithm. Experimental results show that the joint partitioning strategy improves throughput by an average of 17.56% over the traditional LRU replacement policy and by an average of 15.69% over a fairness-based partitioning algorithm. Its fairness rises on average to 3.8 times that of the traditional LRU algorithm and to 3.9 times that of the miss-rate-based UCP algorithm. The predictor partitioning strategy achieves still larger gains in throughput and fairness, and its prediction accuracy and coverage in the first- and second-level caches also improve significantly.
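To make the idea of miss-rate-driven partitioning of a shared last-level cache concrete, here is a minimal sketch of greedy way allocation by marginal miss reduction, in the spirit of schemes such as UCP. It is not the thesis's algorithm, which the abstract does not specify: the function name `partition_ways`, the `min_ways` guard, and the miss curves are all hypothetical values chosen for illustration, and no fairness term is modeled.

```python
def partition_ways(miss_curves, total_ways, min_ways=1):
    """Greedily assign cache ways to cores by marginal miss reduction.

    miss_curves[c][w] is the (hypothetical) miss count of core c when it
    owns w ways, for w = 0..total_ways.
    """
    n = len(miss_curves)
    if n * min_ways > total_ways:
        raise ValueError("not enough ways for the per-core minimum")

    # Every core starts with the guaranteed minimum.
    alloc = [min_ways] * n

    # Hand out the remaining ways one at a time.
    for _ in range(total_ways - n * min_ways):
        # Misses each core would save if it received one more way.
        gains = [curve[a] - curve[a + 1]
                 for curve, a in zip(miss_curves, alloc)]
        # The core with the largest saving wins; ties go to the lower index.
        winner = max(range(n), key=gains.__getitem__)
        alloc[winner] += 1
    return alloc


# Illustrative miss curves (assumed data, not measurements from the thesis).
curves = [
    [100, 60, 40, 30, 25, 22, 20, 19, 18],  # core 0: benefits quickly from ways
    [100, 95, 90, 86, 83, 80, 78, 76, 75],  # core 1: benefits slowly from ways
]
print(partition_ways(curves, total_ways=8))
```

A scheme that also weighs fairness, as the abstract describes, would replace the pure miss-saving gain with a combined score; how the thesis combines the two is not stated here, so that part is left out.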
[Degree-granting institution]: Jilin University
[Degree level]: Doctoral
[Year degree granted]: 2013
[Classification number]: TP332





