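Research on a Distributed Complex Event Stream Processing Engine
Published: 2018-05-04 13:47
Topics: complex event processing + stream computing; Source: Beijing University of Technology, 2016 master's thesis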
[Abstract]: With the arrival of the cloud computing era, many researchers have turned their attention to big data in recent years. However, the development of big data still faces many challenges, such as how to use information technology to process semi-structured data such as XML, how to represent and transform data, how to process these data efficiently, and how to provide development environments suited to applications in different industries. Complex event processing, which operates under an event-driven architecture and combines simple events, event stream processing, and composite events, is one of the key technologies of big data processing. By extracting event sequences that match specific patterns and detecting them in real time, it can meet the high-throughput, low-latency requirements of massive data processing. Many researchers have proposed complex event stream processing languages and stream computing platforms, but each has its own limitations.
To address the problems of existing complex event processing engines, this thesis proposes and designs a distributed complex event stream processing engine based on the complex event stream processing language CEStream. The engine implements event detection based on regular tree patterns, supports both regular-expression matching over time series and structural constraints on semi-structured data, and can capture data from different event sources to detect composite events that match a regular-expression pattern over a specific time series. To meet the need for detecting composite events across multiple data sources, the thesis also proposes a pattern decomposition algorithm that decomposes a complex event processing task into multiple independent event detection tasks, which are deployed on different cluster nodes and on remote event detection agents. This reduces the transmission cost of single-source data and uses the cluster's parallel computing to improve the efficiency of multi-source event detection.
Experimental results show that the system implements the query functionality of the CEStream language, supports its distinctive features such as regular tree pattern matching and multi-source composite event detection, and effectively improves event detection efficiency through the pattern decomposition algorithm. It achieves the design goals of low latency and high throughput and can meet the needs of current mainstream complex event stream processing scenarios.
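The idea behind pattern decomposition can be illustrated with a minimal Python sketch. It is not the thesis's algorithm and does not use CEStream syntax: the names `Event`, `decompose`, `agent_filter`, and `detect_sequence`, the sample streams, and the window value are all hypothetical, and the sketch handles only a flat sequence pattern rather than regular tree patterns. It assumes each source-side agent forwards only the events its sub-pattern needs, and a coordinator then checks the temporal order.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Event:
    source: str  # which event source produced the event
    kind: str    # event type
    ts: int      # timestamp


Step = Tuple[str, str]  # one pattern step: (source, kind)


def decompose(pattern: List[Step]) -> Dict[str, Set[str]]:
    """Split a multi-source sequence pattern into one filter task per source."""
    tasks: Dict[str, Set[str]] = defaultdict(set)
    for source, kind in pattern:
        tasks[source].add(kind)
    return dict(tasks)


def agent_filter(stream: Iterable[Event], kinds: Set[str]) -> List[Event]:
    """Runs next to a source: forward only events the pattern can use."""
    return [e for e in stream if e.kind in kinds]


def detect_sequence(events: Iterable[Event], pattern: List[Step],
                    window: int) -> List[Tuple[Event, ...]]:
    """Coordinator: report every in-order match of a non-empty pattern
    whose first and last events are at most `window` time units apart."""
    matches: List[Tuple[Event, ...]] = []
    partial: List[Tuple[Event, ...]] = []
    for e in sorted(events, key=lambda ev: ev.ts):
        step = (e.source, e.kind)
        # Drop partial matches that can no longer finish inside the window.
        partial = [p for p in partial if e.ts - p[0].ts <= window]
        # Extend partial matches whose next expected step is this event.
        grown = [p + (e,) for p in partial if pattern[len(p)] == step]
        if pattern[0] == step:
            grown.append((e,))
        for p in grown:
            (matches if len(p) == len(pattern) else partial).append(p)
    return matches


# Hypothetical pattern: high temperature on source A, then smoke on source B.
pattern = [("A", "high_temp"), ("B", "smoke")]
streams = {
    "A": [Event("A", "heartbeat", 1), Event("A", "high_temp", 2),
          Event("A", "heartbeat", 3)],
    "B": [Event("B", "heartbeat", 1), Event("B", "smoke", 4),
          Event("B", "smoke", 30)],
}

tasks = decompose(pattern)
forwarded = [e for src, kinds in tasks.items()
             for e in agent_filter(streams[src], kinds)]
matches = detect_sequence(forwarded, pattern, window=10)
```

In this toy run only three of the six raw events leave their sources, and the coordinator reports a single match (the events at timestamps 2 and 4); the smoke event at timestamp 30 falls outside the window.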
[Degree-granting institution]: Beijing University of Technology
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP311.13
Article ID: 1843196
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1843196.html