
Research on Key Technologies of Distributed Parallel Simulation for Large-Scale Parallel Systems-on-Chip

Published: 2018-01-05 03:15

  Source: National University of Defense Technology, master's thesis, 2012. Document type: degree thesis


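  Keywords: large-scale parallel system-on-chip; distributed parallel simulator; instruction semantic mapping; atomic instructions; software distributed shared memory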


[Abstract]: Since clock frequencies stopped scaling, large-scale parallel systems-on-chip, typified by multi-core and many-core architectures, have become a focus of microprocessor research and implementation. Following Moore's law, additional chip resources are being turned into more processor cores per chip, and processors with on the order of a thousand cores are no longer far off. As the number of target cores grows, the performance of traditional serial simulators degrades sharply, while the design space of large-scale parallel systems-on-chip has expanded several-fold. Architectural design-space exploration therefore becomes far less efficient, and the simulator, a key research tool, faces a major challenge.

The parallel computing capability of the host platform can be used to exploit the coarse-grained parallelism that exists naturally in the target machine model. This thesis proposes a distributed parallel simulation (DPS) acceleration framework for large-scale parallel systems-on-chip, which aims to improve simulator performance by replacing the traditional serial simulation mechanism. It focuses on two problems in distributed parallel simulation: the efficiency of simulating atomic instructions and the efficiency of simulating shared state.

Because simulating atomic instructions with locks is inefficient, the thesis proposes a parallel simulation technique for atomic instructions based on instruction semantic mapping (ISM). The technique establishes a one-to-one semantic mapping between the atomic instructions of the target machine and those of the host, so that executing the host atomic instruction replaces simulating the target one. The approach is simple to implement, makes the correctness of parallel atomic-instruction simulation easy to guarantee, and outperforms the locking approach, with a performance improvement of up to 30%.
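The contrast between the locking approach and ISM can be illustrated with a minimal C11 sketch. Everything here is an assumption made for illustration: the abstract does not name the target instruction set, so the target `SWAP` and `CAS` instructions, the function names, and the single global spinlock used as the locking baseline are all hypothetical and are not taken from FTsim or DPFTsim.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical target instructions: SWAP (atomic exchange) and CAS.
 * Simulated target memory words are modeled as 32-bit host words. */

/* Locking baseline (assumed): one global spinlock serializes every
 * simulated atomic instruction, even those touching unrelated words. */
static atomic_flag sim_mem_lock = ATOMIC_FLAG_INIT;

uint32_t sim_swap_locked(uint32_t *addr, uint32_t new_val) {
    while (atomic_flag_test_and_set_explicit(&sim_mem_lock,
                                             memory_order_acquire)) {
        /* spin until the lock is released */
    }
    uint32_t old = *addr;   /* read-modify-write under the lock */
    *addr = new_val;
    atomic_flag_clear_explicit(&sim_mem_lock, memory_order_release);
    return old;
}

/* ISM-style: the target SWAP is mapped one-to-one onto the host's
 * atomic exchange, so no simulator-level lock is taken. */
uint32_t sim_swap_ism(_Atomic uint32_t *addr, uint32_t new_val) {
    return atomic_exchange(addr, new_val);
}

/* ISM-style mapping of a target CAS onto the host's compare-and-swap.
 * Returns 1 if the word held `expected` and was replaced, else 0. */
int sim_cas_ism(_Atomic uint32_t *addr, uint32_t expected, uint32_t desired) {
    return atomic_compare_exchange_strong(addr, &expected, desired);
}
```

In this sketch the mapped versions contain no simulator-level lock, so simulation threads operating on different words do not serialize on one another. The 30% figure reported in the abstract comes from the thesis's own experiments and is not something this sketch measures.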
To address the problem that the hosts in a distributed parallel simulation do not share memory and therefore cannot directly simulate a shared-memory target machine, the thesis proposes an efficient shared-state simulation technique based on software distributed shared memory (SDSM). Two SDSM models are implemented, at the Host and Simulator levels of abstraction, and the experimental results show that the technique can simulate a shared-memory target machine correctly and efficiently on distributed hosts.

Building on this framework and these techniques, the thesis implements the distributed parallel simulator DPFTsim on top of the FTsim simulator. Experiments show that the DPS framework, ISM, and SDSM effectively accelerate the simulation of large-scale parallel systems-on-chip through distributed parallelism. With 10 simulation threads, DPFTsim achieves 4.5 times the performance of the serial simulator FTsim.
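The SDSM idea mentioned above can be sketched as a toy, single-process model in C. The abstract does not say which coherence protocol or granularity the thesis uses, nor how its Host-level and Simulator-level models differ, so the page-granularity write-invalidate protocol, the home copy, and every name below (`sdsm_read`, `sdsm_write`, `fetch_page`, the constants) are assumptions chosen only to show the general mechanism. Each `host_t` stands in for the private memory of one host, and `memcpy` stands in for a network transfer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_HOSTS  2
#define NUM_PAGES  4
#define PAGE_WORDS 16

/* INVALID must be 0 so zero-initialized hosts start with no pages. */
typedef enum { INVALID = 0, SHARED, EXCLUSIVE } page_state_t;

typedef struct {
    page_state_t state[NUM_PAGES];
    uint32_t     data[NUM_PAGES][PAGE_WORDS];
} host_t;

static host_t   hosts[NUM_HOSTS];             /* private memory per host */
static uint32_t home[NUM_PAGES][PAGE_WORDS];  /* home copy of each page */
static unsigned page_transfers;               /* simulated network fetches */

/* Bring an up-to-date read-only copy of `page` to host `h`. */
static void fetch_page(int h, int page) {
    for (int o = 0; o < NUM_HOSTS; o++) {
        if (o != h && hosts[o].state[page] == EXCLUSIVE) {
            /* the writer flushes its copy home and is downgraded */
            memcpy(home[page], hosts[o].data[page], sizeof home[page]);
            hosts[o].state[page] = SHARED;
        }
    }
    memcpy(hosts[h].data[page], home[page], sizeof home[page]);
    hosts[h].state[page] = SHARED;
    page_transfers++;
}

/* Load of target word `addr`, issued by the simulator on host `h`. */
uint32_t sdsm_read(int h, uint32_t addr) {
    int page = (int)(addr / PAGE_WORDS);
    assert(h >= 0 && h < NUM_HOSTS && page < NUM_PAGES);
    if (hosts[h].state[page] == INVALID)
        fetch_page(h, page);
    return hosts[h].data[page][addr % PAGE_WORDS];
}

/* Store to target word `addr`: gain exclusive ownership first. */
void sdsm_write(int h, uint32_t addr, uint32_t val) {
    int page = (int)(addr / PAGE_WORDS);
    assert(h >= 0 && h < NUM_HOSTS && page < NUM_PAGES);
    if (hosts[h].state[page] != EXCLUSIVE) {
        if (hosts[h].state[page] == INVALID)
            fetch_page(h, page);
        for (int o = 0; o < NUM_HOSTS; o++)
            if (o != h)
                hosts[o].state[page] = INVALID;  /* invalidate other copies */
        hosts[h].state[page] = EXCLUSIVE;
    }
    hosts[h].data[page][addr % PAGE_WORDS] = val;
}
```

For example, after `sdsm_write(0, 3, 42)`, a call to `sdsm_read(1, 3)` pulls the page through the home copy and returns 42, and a later `sdsm_write(1, 3, 7)` invalidates host 0's copy. A real distributed implementation would replace the in-process copies with messages between hosts and would need to handle concurrent requests, which this sketch deliberately leaves out.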
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year conferred]: 2012
[Classification number]: TP332





Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1381321.html

