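Research on Architecture-Level Fault Behavior Based on Co-Simulation
Published: 2018-03-11 00:37
Topic: co-simulation  Focus: RAS simulator  Source: Harbin Institute of Technology, 2013 master's thesis  Type: degree thesis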
[Abstract]: With the explosive growth of computing, computer systems are widely used in aviation, finance, transportation, telecommunications, medical care, education, and other sectors closely tied to daily life, and have become essential infrastructure for information maintenance and management in those sectors. This has broadened and deepened the development and application of fault-tolerant computing, and computer reliability has become an active research area. However, as computers advance rapidly in line with Moore's law, integrated circuit density increases, so fault-inducing factors such as temperature-related thermal effects and current-related power consumption grow substantially, and the probability that a circuit experiences a transient, intermittent, or permanent fault grows with them. It is therefore necessary to study hierarchical software fault-tolerance techniques for processor hardware faults.
This work focuses on analyzing fault behavior at the processor hardware architecture level. Its core is an abnormal-event capture system based on co-simulation. The work starts from an analysis of the instruction set simulator RAS (Riesling Architecture Simulator) and the RTL simulator. The first part analyzes the system composition, the simulator pipeline, and the internal functional units. Next, a communication interface between the instruction set simulator and the RTL simulator is designed and a set of exception signals is defined, forming an important part of the entry point to the capture system.
The most important research work is the design of the exception capture system, which is built around the detailed design of three capture modules. The TLB exception capture module captures TLB-related exceptions and synchronizes TLB-related operations. The interrupt exception capture module captures trap operations for the different interrupt branches. The memory exception capture module provides capture units for instruction prefetch and for read and write operations, and keeps memory synchronized. As a result, while faults are injected during co-simulation, the capture system automatically collects system interaction information and captures abnormal symptoms that are inconsistent with the golden reference model.
Finally, fault injection experiments are run on the co-simulation platform with the abnormal-event capture system enabled. Abnormal symptom information is collected during the experiments, and analysis of a large volume of experimental data yields the behavior and symptom distribution of faults at the architecture level. The fault symptom information is also used as the input classification features of a BP neural network for fault diagnosis.
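The lockstep comparison against a golden reference model that the abstract describes can be illustrated with a minimal sketch. This is not the thesis's implementation: the `ToyCore` class, its three-instruction toy ISA, the `co_simulate` function, and the symptom labels `register_mismatch` and `memory_mismatch` are all hypothetical stand-ins for the RAS/RTL pair and the TLB, interrupt, and memory capture modules. The sketch only shows the general idea of injecting a single bit flip into one side and recording where the two sides diverge.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCore:
    """Hypothetical stand-in for one side of the co-simulation."""
    regs: list = field(default_factory=lambda: [0] * 4)
    mem: dict = field(default_factory=dict)

    def step(self, instr):
        op, a, b = instr
        if op == "li":       # load immediate: regs[a] = b
            self.regs[a] = b
        elif op == "add":    # regs[a] = regs[a] + regs[b] (32-bit wrap)
            self.regs[a] = (self.regs[a] + self.regs[b]) & 0xFFFFFFFF
        elif op == "st":     # store: mem[regs[b]] = regs[a]
            self.mem[self.regs[b]] = self.regs[a]


def flip_bit(core, reg, bit):
    """Inject a transient fault by flipping one bit of one register."""
    core.regs[reg] ^= 1 << bit


def co_simulate(program, fault=None):
    """Run a golden core and a fault-injected core in lockstep.

    fault is (step_index, reg, bit) or None. Returns a list of
    (step_index, symptom) pairs recorded after each instruction.
    """
    golden, dut = ToyCore(), ToyCore()
    symptoms = []
    for i, instr in enumerate(program):
        if fault is not None and fault[0] == i:
            flip_bit(dut, fault[1], fault[2])  # inject before step i
        golden.step(instr)
        dut.step(instr)
        if dut.regs != golden.regs:
            symptoms.append((i, "register_mismatch"))
        if dut.mem != golden.mem:
            symptoms.append((i, "memory_mismatch"))
    return symptoms


program = [("li", 0, 5), ("li", 1, 8), ("add", 0, 1), ("st", 0, 1)]
print(co_simulate(program, fault=(2, 0, 0)))  # fault propagates to memory
print(co_simulate(program, fault=(0, 0, 3)))  # fault is overwritten (masked)
```

In this toy setup, a flip injected just before the `add` reaches memory through the following store, while a flip injected before the first `li` is overwritten and leaves no symptom. A symptom list of this kind is the sort of feature vector that could be passed to a classifier such as the BP neural network mentioned in the abstract.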
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree awarded]: 2013
[Classification number]: TP302.8
Article ID: 1595835
Article link: https://www.wllwen.com/kejilunwen/jisuanjikexuelunwen/1595835.html