Design and Implementation of an Interpretive Full-System Instruction Set Simulator
Published: 2019-03-24 19:44
[Abstract]: As embedded systems are applied more and more widely, embedded application systems take on more functions while their update cycles keep getting shorter. This puts heavy pressure on design and development and calls for hardware/software co-development, which has driven the rapid development of instruction set simulators; they are also widely used in the design and verification of new microprocessor architectures. Research into a fast full-system instruction set simulator therefore has both theoretical and practical significance.
Interpretive instruction set simulation is flexible and accurate but slow. To address this, the thesis designs and implements IISimulator, an interpretive instruction set simulator based on a shared block-level cache. The simulator exploits the temporal and spatial locality of program execution: the results of the decode stage are cached in units of instruction blocks, and when a block is executed again its cached decode results are used directly, skipping the time-consuming decode stage. In addition, the memory that holds decode results is managed through a shared memory pool, which reduces the memory-management overhead introduced by the block-level cache.
In the testing phase, a set of representative target-machine applications was used to measure the simulator's performance. Simulation speed was recorded with no cache, with an instruction-level cache, and with the block-level cache; the comparison shows that the block-level cache clearly improves the speed of the interpretive simulator. Speed was also compared with and without the shared memory pool, and the results show that the pool effectively reduces the memory-management overhead caused by caching. Finally, IISimulator was compared with two other full-system simulators, SkyEye and SimpleScalar, and was faster on average. These results indicate that the improvements substantially raise the execution efficiency of an interpretive instruction set simulator.
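The caching idea described in the abstract can be illustrated with a minimal sketch. It is not IISimulator's code: the four-opcode toy instruction set, the word encoding, and every name below (`encode`, `decode`, `run`, `Stats`, `PROGRAM`) are assumptions made for illustration, and the shared memory pool is not modeled. The sketch runs the same program with no cache, an instruction-level cache, and a block-level cache, and counts decoder calls and cache lookups.

```python
from dataclasses import dataclass

# Hypothetical toy ISA (an assumption, not the thesis's target machine).
OP_ADDI, OP_SUBI, OP_BNEZ, OP_HALT = range(4)
TERMINATORS = (OP_BNEZ, OP_HALT)  # instructions that end a block


def encode(op, a=0, b=0):
    """Pack an instruction into one word: opcode, operand a, operand b."""
    return (op << 16) | (a << 8) | b


@dataclass
class Stats:
    decodes: int = 0  # how many times the decoder ran
    lookups: int = 0  # how many times the cache was consulted


def decode(word, stats):
    """Unpack a word; stands in for the costly decode stage."""
    stats.decodes += 1
    return (word >> 16) & 0xFF, (word >> 8) & 0xFF, word & 0xFF


def execute(inst, regs, pc):
    """Run one decoded instruction; return the next pc, or None on HALT."""
    op, a, b = inst
    if op == OP_ADDI:
        regs[a] += b
    elif op == OP_SUBI:
        regs[a] -= b
    elif op == OP_BNEZ:
        return b if regs[a] != 0 else pc + 1
    elif op == OP_HALT:
        return None
    return pc + 1


def run(program, mode):
    """Interpret `program` with mode 'none', 'instruction', or 'block'."""
    regs, pc, stats, cache = [0] * 4, 0, Stats(), {}
    while pc is not None:
        if mode == "none":
            pc = execute(decode(program[pc], stats), regs, pc)
        elif mode == "instruction":
            stats.lookups += 1
            if pc not in cache:
                cache[pc] = decode(program[pc], stats)
            pc = execute(cache[pc], regs, pc)
        else:  # "block": cache decoded instructions up to a terminator
            stats.lookups += 1
            if pc not in cache:
                block, p = [], pc
                while True:
                    inst = decode(program[p], stats)
                    block.append(inst)
                    p += 1
                    if inst[0] in TERMINATORS:
                        break
                cache[pc] = block
            for inst in cache[pc]:
                pc = execute(inst, regs, pc)
    return regs, stats


# r0 counts down from 5; each loop iteration adds 2 to r1.
PROGRAM = [
    encode(OP_ADDI, 0, 5),  # 0: r0 += 5
    encode(OP_ADDI, 1, 2),  # 1: r1 += 2   (loop start)
    encode(OP_SUBI, 0, 1),  # 2: r0 -= 1
    encode(OP_BNEZ, 0, 1),  # 3: if r0 != 0 goto 1
    encode(OP_HALT),        # 4: stop
]

if __name__ == "__main__":
    for mode in ("none", "instruction", "block"):
        regs, stats = run(PROGRAM, mode)
        print(mode, regs, stats)
```

On this toy loop all three modes leave r1 at 10. Without a cache the decoder runs 17 times; the instruction-level cache decodes 5 times but performs 17 lookups; the block-level cache decodes 8 times (overlapping blocks are decoded separately in this sketch) and performs 6 lookups. The counts only illustrate the mechanism and say nothing about the speeds measured in the thesis.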
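[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree conferred]: 2012
[Classification number]: TP368.1; TP337
Article ID: 2446621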
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/2446621.html