Research on Shared Cache Management Strategies for Chip Multi-Processors
Published: 2018-11-09 16:46
[Abstract]: The chip multi-processor (CMP) has become the dominant trend in high-performance microprocessor design. As the critical bridge between the processor and main memory, the cache plays an important role in optimizing computer system performance, so a strategy that effectively manages the CMP's shared cache is of great significance to today's high-performance microprocessors.

Current academic work on CMP cache optimization is mainly oriented toward multiprogrammed workloads; how existing cache optimization techniques can improve the performance of multithreaded applications remains an open problem. Focusing on load balancing within multithreaded applications and on the quality of service (QoS) of mixed multithreaded workloads, this thesis studies shared-cache optimization strategies for CMP systems running single or mixed multithreaded applications. The main contributions are:

1. To address the load imbalance among the threads of parallel applications running in the fork-join model, this thesis designs a critical-thread-guided fine-grained cache management strategy. Checkpoints inserted into the source program let each processor count the iterations it completes in a parallel loop region, so the critical thread of the parallel program can be identified accurately. Allocating more cache space to the critical thread balances the load across threads and accelerates the critical thread's execution, improving the overall performance of the parallel program. Experiments show that this strategy achieves speedups of 1-6% on parallel programs from domains such as computer vision and data mining.

2. A critical-thread-aware shared cache management strategy (CASCM) is proposed to address the problem that current systems cannot allocate cache space effectively according to process priority. CASCM considers not only the priorities of processes but also the priorities of threads within a parallel program, and partitions cache space according to both. It improves the performance of other processes as much as possible while guaranteeing the QoS level of high-priority programs, requires no major changes to the existing cache structure, and incurs little hardware cost. Experiments show that CASCM uses cache space more effectively than the ATR cache management strategy: while maintaining the QoS level of high-priority programs, it delivers larger performance gains for low-priority processes.
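The checkpoint-based critical-thread detection of contribution 1 can be sketched as follows. This is an illustrative Python model, not the thesis's actual implementation: the 16-way shared last-level cache, the way-granularity partitioning, and the helper names `find_critical_thread` and `allocate_ways` are all assumptions. The idea shown is the one the abstract states: the thread that has completed the fewest loop iterations at a checkpoint has the most remaining work, is treated as the critical thread, and is granted extra cache ways.

```python
# Sketch: checkpoint-driven critical-thread detection and way allocation.
# Assumed configuration: a 16-way shared LLC partitioned at way granularity.

TOTAL_WAYS = 16  # assumed number of ways in the shared last-level cache

def find_critical_thread(iterations_done):
    """Return the index of the thread with the fewest completed iterations
    at the current checkpoint (i.e. the thread with the most work left)."""
    return min(range(len(iterations_done)), key=lambda t: iterations_done[t])

def allocate_ways(iterations_done, bonus_ways=4):
    """Give the critical thread `bonus_ways` extra ways; split the rest evenly.
    Any remainder after the even split also goes to the critical thread."""
    n = len(iterations_done)
    critical = find_critical_thread(iterations_done)
    base = (TOTAL_WAYS - bonus_ways) // n
    ways = [base] * n
    ways[critical] += bonus_ways + (TOTAL_WAYS - bonus_ways) % n
    return critical, ways

# Example: at a checkpoint, thread 2 lags behind the other three threads.
critical, ways = allocate_ways([120, 115, 80, 118])
# critical == 2; ways == [3, 3, 7, 3] (the lagging thread gets 7 of 16 ways)
```

In hardware, such a policy would typically be realized with per-core way masks on the shared cache, re-evaluated each time a checkpoint reports fresh iteration counts.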
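Contribution 2's priority-driven allocation can likewise be sketched. The abstract only states that CASCM partitions cache space according to both process and thread priorities; the combined multiplicative weighting and the function name `partition_ways` below are hypothetical stand-ins for the actual policy, which the abstract does not specify.

```python
# Sketch: priority-aware way partitioning in the spirit of CASCM.
# Each thread carries a (process priority, thread priority) pair; shared
# cache ways are split in proportion to a combined weight, with leftover
# ways handed to the highest-weight threads first.

TOTAL_WAYS = 16  # assumed number of ways in the shared last-level cache

def partition_ways(threads):
    """threads: list of (proc_priority, thread_priority) pairs.
    Returns the number of cache ways assigned to each thread."""
    weights = [p * t for p, t in threads]  # assumed combined weight
    total = sum(weights)
    ways = [TOTAL_WAYS * w // total for w in weights]
    # Distribute the remainder to the highest-priority threads first,
    # so high-priority programs keep their QoS guarantee.
    leftover = TOTAL_WAYS - sum(ways)
    for i in sorted(range(len(threads)), key=lambda i: -weights[i])[:leftover]:
        ways[i] += 1
    return ways

# Example: one high-priority process with a critical thread (weight 3*2)
# and a sibling thread (3*1), sharing the LLC with two low-priority threads.
ways = partition_ways([(3, 2), (3, 1), (1, 1), (1, 1)])
# ways == [9, 5, 1, 1]: the high-priority process holds most of the cache,
# but the low-priority threads still retain some ways.
```

The low-priority threads are never starved entirely, which matches the abstract's claim that low-priority processes can still gain performance while high-priority QoS is preserved.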
[Degree-granting institution]: Hunan University
[Degree level]: Master's
[Year conferred]: 2013
[Classification number]: TP333
Document ID: 2320996
Link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2320996.html