
Fault Localization for Distributed Systems Based on Event Processing

Published: 2018-07-04 22:28

  Topics: distributed networks + real-time monitoring systems; Source: Computer Science (《计算机科学》), 2013, Issue S1


[Abstract]: In recent years, distributed computing systems have grown steadily in scale, and their behavior has become more complex and harder to control. The number of faults occurring in these systems has grown exponentially, causing serious harm and losses, and troubleshooting and locating a fault once a problem appears has become even more difficult. The traditional approach, which judges whether a program is running correctly by tracing its execution, consumes too many resources when exchanging distributed monitoring information and is highly intrusive to the target program, so it can no longer meet the needs of software behavior analysis. In a distributed monitoring environment where events occur in large volume, rapidly, and without interruption, there is a pressing need to detect and locate system faults promptly through complex event processing. Complex event processing analyzes system behavior from meaningful state-change events, assesses the running condition of the system, detects and locates faults in time, and so keeps the system running healthily. Most existing complex event description languages describe complex events with SQL-based methods. Such data stream query languages are complicated for ordinary users and hard to master. This paper constructs a set-based event stream model: events are formally defined and represented as sets, and corresponding operations are defined on them, so that users need to master only a few simple set operations in order to define complex fault rules.
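The abstract does not list the operators the paper defines, so the following is only a minimal Python sketch of the general idea of treating an event stream as a set and writing a fault rule with ordinary set operations. The `Event` fields, the helper names `select`, `window`, and `nodes`, and the sample event kinds are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """One monitoring event (hypothetical fields, chosen for illustration)."""
    node: str       # host that emitted the event
    kind: str       # event type, e.g. "heartbeat_missed"
    timestamp: int  # seconds since the start of monitoring


def select(events, kind):
    """Subset of events with the given kind."""
    return {e for e in events if e.kind == kind}


def window(events, start, end):
    """Subset of events whose timestamp lies in [start, end)."""
    return {e for e in events if start <= e.timestamp < end}


def nodes(events):
    """Set of node names that appear in the given events."""
    return {e.node for e in events}


# A small sample stream; the values are made up for this sketch.
stream = {
    Event("node-a", "heartbeat_missed", 10),
    Event("node-a", "disk_error", 12),
    Event("node-b", "heartbeat_missed", 11),
    Event("node-c", "disk_error", 40),
}

recent = window(stream, 0, 30)

# Rule 1 (intersection): nodes that both missed a heartbeat and
# reported a disk error inside the window.
suspected_disk_fault = (
    nodes(select(recent, "heartbeat_missed"))
    & nodes(select(recent, "disk_error"))
)

# Rule 2 (difference): nodes that missed a heartbeat without a disk error.
suspected_network_fault = (
    nodes(select(recent, "heartbeat_missed"))
    - nodes(select(recent, "disk_error"))
)

print(sorted(suspected_disk_fault))     # ['node-a']
print(sorted(suspected_network_fault))  # ['node-b']
```

In this sketch a rule is a single expression over intersection and difference, which is the kind of simplicity the abstract contrasts with SQL-style stream queries. A real system would evaluate such rules incrementally over an unbounded stream rather than over a fixed in-memory set.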
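[Author affiliations]: National Computer Network Emergency Response Technical Team/Coordination Center of China; Institute of Information Engineering, Chinese Academy of Sciences; University of Chinese Academy of Sciences
[Funding]: Supported by the National "242" Information Security Program (2010A029) and the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030200)
[Classification number]: TP393.08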

