Improved Algorithms for the Maximal Information Coefficient and Their Application to Railway Accident Analysis
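Keywords: railway accidents; early warning; correlation; MIC; graph model; clustering algorithm. Source: Beijing Jiaotong University, doctoral dissertation, 2016. Document type: degree thesis.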
[Abstract]: Railway transportation plays an important role in the overall transportation system. With the large-scale construction of China's railways, railway transportation has entered a stage of leapfrog development: operating mileage keeps increasing, and freight and passenger volumes keep growing. At the same time, major and extraordinarily serious railway accidents still occur occasionally and cause great loss of life and property, so ensuring safety remains an important task in railway transportation. Advanced electronic and electrical equipment is continually being introduced into the railway system, and the number of factors that affect railway safety keeps growing. With so many factors involved, the first step is to analyze the correlations among them. Compared with other statistical correlation coefficients, the Maximal Information Coefficient (MIC) has two desirable properties, generality and equitability, and it can detect different types of relationships. This thesis analyzes the definition of the two-variable MIC proposed by Reshef et al. and its approximation algorithm and, to address their shortcomings, proposes fast algorithms for computing the two-variable and multivariable MIC on large-scale data. Based on MIC, it then studies railway accident analysis and early warning. The main contributions are as follows.
1. A mathematical programming model for computing the two-variable MIC is proposed, and a fast algorithm for large-scale data is designed. By analyzing the definition of the two-variable MIC given by Reshef et al., the objective and the constraints of the computation are made explicit and stated as a mathematical programming model. To address the long running time of the approximation algorithm of Reshef et al., the k-means clustering algorithm is used to partition each of the two variables separately, which yields a grid partition of the two variables; on this basis, a fast algorithm for computing the two-variable MIC on large-scale data is proposed. Numerical experiments show that the MIC computed by the fast algorithm retains the two desirable properties of MIC, generality and equitability; that the computation times for different types of two-variable relationships are very close; and that computation time grows slowly as the data size increases. The time complexity of both algorithms is analyzed: the approximation algorithm of Reshef et al. runs in O(n^2.4), whereas the proposed fast algorithm runs in O(n^1.6), which makes it better suited to discovering two-variable relationships in large-scale data.
2. A definition of the multivariable MIC is given, and a fast algorithm for computing it on large-scale data is proposed. Using the chain rule of mutual information, the multivariable mutual information is decomposed into a sum of mutual informations between one variable and several variables, so that the variables are divided into a dependent part and an independent part; this yields the definition of the multivariable MIC. Using the bisecting k-means clustering algorithm, the independent variables and the dependent variables are each partitioned into different numbers of blocks, and a fast algorithm for computing the multivariable MIC on large-scale data is proposed. Numerical results show that the multivariable MIC computed by the fast algorithm retains generality and equitability, that its computation time is short and grows slowly, and that the algorithm is suited to discovering multivariable relationships in large-scale data.
3. A complex network model of railway accidents based on MIC is proposed. Accident factors serve as the network nodes, and edges are generated according to the MIC value between two nodes. The changes in network structure under different dependence levels are analyzed, specifically the changes in node degree, degree distribution, isolated nodes, connected graphs, and the average connectivity degree of the network. For a given fixed factor, the factors that influence it most strongly can be identified as the dependence level increases.
4. A MIC-based railway accident early-warning method is proposed. The relevant influencing factors are ranked by their degree of correlation as measured by MIC, and an artificial neural network model is used to obtain fitted curves for different numbers of influencing factors, from which the optimal fitted curve between the target factor and the influencing factors is obtained. On this basis, the concept of a danger zone is introduced and a railway accident early-warning method is proposed. When the factors that affect railway safety enter the danger zone, adjusting the abnormal factor indicators can largely prevent railway accidents.
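The idea in the first contribution, partitioning each variable with k-means and scoring the resulting grid, can be illustrated with a minimal sketch. This is not the thesis's algorithm: the function names (`kmeans_1d`, `mutual_information`, `mic_kmeans`), the evenly spaced initial centers, and the exhaustive search over cluster counts are all assumptions made for illustration. The grid-size bound B(n) = n^0.6 follows the usual convention of Reshef et al. The sketch uses only the Python standard library.

```python
import math
from collections import Counter


def kmeans_1d(values, k, iters=50):
    """Lloyd's algorithm in one dimension; returns one cluster label per value."""
    lo, hi = min(values), max(values)
    # Assumed initialization: centers spread evenly over the value range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assign each value to its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        sums, counts = [0.0] * k, [0] * k
        for v, c in zip(values, labels):
            sums[c] += v
            counts[c] += 1
        # Recompute centers; an empty cluster keeps its previous center.
        updated = [sums[c] / counts[c] if counts[c] else centers[c] for c in range(k)]
        if updated == centers:
            break
        centers = updated
    return labels


def mutual_information(a, b):
    """Mutual information, in bits, between two equally long label sequences."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum(
        (c / n) * math.log2(c * n / (pa[i] * pb[j]))
        for (i, j), c in pab.items()
    )


def mic_kmeans(x, y, alpha=0.6):
    """Approximate MIC: best normalized mutual information over k-means grids.

    Only grids with kx * ky <= n ** alpha cells are considered.
    """
    budget = len(x) ** alpha
    max_k = int(budget // 2)  # the other axis needs at least 2 clusters
    labels_x = {k: kmeans_1d(x, k) for k in range(2, max_k + 1)}
    labels_y = {k: kmeans_1d(y, k) for k in range(2, max_k + 1)}
    best = 0.0
    for kx, lx in labels_x.items():
        for ky, ly in labels_y.items():
            if kx * ky > budget:
                continue
            # Normalize so that a noiseless functional relationship scores near 1.
            score = mutual_information(lx, ly) / math.log2(min(kx, ky))
            best = max(best, score)
    return best
```

Calling `mic_kmeans(x, y)` on two equally long lists of numbers returns a score between 0 and 1 (up to floating-point rounding). Because each axis is clustered once per cluster count and the resulting labels are reused across grids, the cost is dominated by the one-dimensional clustering rather than by a search over grid boundaries; the thesis's actual complexity analysis is not reproduced here.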
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Doctoral
[Year degree awarded]: 2016
[Classification number]: U298.5; TP301.6
Article ID: 1545358
Article link: https://www.wllwen.com/shoufeilunwen/xxkjbs/1545358.html