Research on Key Technologies for Machine Learning-Based Intrusion Detection and Alert Correlation
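Topic: intrusion detection. Focus: feature dimensionality reduction. Source: Beijing University of Posts and Telecommunications, 2016 doctoral dissertation. Document type: degree thesis.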
[Abstract]: As network technology becomes ever more deeply embedded in work and daily life, the Internet has become critical infrastructure carrying massive volumes of data. The convenience it brings is shadowed by network attacks, and network security faces serious threats. Intrusion detection and alert correlation are important components of the network security technology stack: intrusion detection collects and analyzes network data to discover attacks promptly and reduce security threats, while alert correlation fuses and analyzes information from multiple sources to widen detection coverage and improve alert quality. As networks grow and attack techniques become more diverse and complex, the dimensionality and volume of the data to be analyzed keep increasing, and traditional intrusion detection and alert correlation methods face great challenges in handling massive high-dimensional data. Drawing on machine learning techniques, this dissertation aims to improve intrusion detection performance and the degree of automation of alert correlation. It studies feature dimensionality reduction, data stream classification, anomaly detection, and correlation rule generation, and achieves a number of innovative results. The main contributions are as follows:

1. Processing massive high-dimensional data during intrusion detection is time-consuming, laborious, and poorly suited to real-time use. To address this, feature dimensionality reduction is studied by combining rough set theory with principal component analysis. The goal of feature dimensionality reduction is to reduce the number of features and improve analysis efficiency without weakening the classification power or representational power of the data. The dissertation proposes a new feature reduction method: a discernibility matrix and information entropy are used for feature selection, a weighted kernel function is constructed for feature mapping and feature extraction, and the two are combined to perform multi-level deep extraction on the raw features. This yields a more compact high-level feature representation and improves the real-time performance of intrusion detection.

2. Classification is widely used in misuse detection, and classification models are usually trained on labeled data. The dynamic, streaming nature of the data to be analyzed and the high cost of obtaining labels challenge traditional methods. To address this, the dissertation proposes a data stream classification method based on decision feedback. It first uses ensemble learning to train an initial classification model on the labeled data chunks in the stream, and uses that model to make initial decisions on the classes of unlabeled data. These decisions are then used to train a clustering model on the unlabeled data, which supplies constraint information for classification. The supervised ensemble classifier is thereby extended to a semi-supervised one, and the final class of each instance is determined by maximizing consistency between the models, so that unlabeled data improves classification performance.

3. Anomaly detection builds a profile of normal user behavior and uses it to identify network intrusions and other abnormal behavior. In real environments, the purity and completeness of the normal-behavior dataset are hard to guarantee, which degrades the performance of the detection model. To address this, the dissertation draws on active learning to propose an enhanced, semi-supervised one-class support vector machine (SVM) anomaly detection model. The method first builds an anomaly detection model with a one-class SVM in an unsupervised fashion, then uses active learning to select a small amount of data for labeling, and uses the label information to extend the model to a semi-supervised one-class SVM. The selection strategy and stopping criterion of active learning are modified to balance the requirements of data purity and completeness, so that a large gain in detection performance is obtained at a small labeling cost.

4. Alert correlation is a research hotspot in network security. It applies predefined rule directives to correlate the events reported by security devices, revealing the meaningful connections hidden behind discrete events. Research in this area has concentrated on correlation methods and rule representation, while acquiring and updating correlation rules still relies largely on manual intervention, which limits the adaptability of the approach. To address this, the dissertation proposes a correlation rule generation method based on neural networks and genetic programming. A neural network model first classifies events by attack scenario; rule items are extracted from the classification results to produce a training set; genetic programming then generates and optimizes the correlation rules. Rules are thus generated and updated automatically, which improves the automation and adaptability of correlation analysis.

In summary, against the background of increasingly complex and diverse network attacks, and in view of the challenges facing current intrusion detection and alert correlation methods, the dissertation applies machine learning to feature extraction, data classification, anomaly detection, and correlation rule generation, proposes solutions, and verifies their feasibility and accuracy through experiments. The results help improve the efficiency and accuracy of intrusion detection, increase the automation and adaptability of correlation analysis, and help people perceive potential threats in massive data more promptly and accurately.
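The abstract says that information entropy is used for feature selection but gives no formulas. As a rough illustration only, the sketch below ranks discrete features by information gain, H(Y) - H(Y | X), which is one common entropy-based selection criterion. It is not the dissertation's method, which also involves a discernibility matrix and a weighted kernel mapping. The feature names, the toy connection records, and the 0.5 threshold are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(column, labels):
    """Reduction in label entropy after splitting on one discrete feature."""
    groups = defaultdict(list)
    for value, label in zip(column, labels):
        groups[value].append(label)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(rows, labels, names):
    """Return (name, gain) pairs sorted from most to least informative."""
    columns = list(zip(*rows))
    gains = [(name, information_gain(col, labels))
             for name, col in zip(names, columns)]
    return sorted(gains, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy connection records: (protocol, flag, service).
names = ["protocol", "flag", "service"]
rows = [
    ("tcp", "S0", "http"),
    ("tcp", "S0", "ftp"),
    ("tcp", "SF", "http"),
    ("udp", "SF", "ftp"),
    ("tcp", "S0", "http"),
    ("udp", "SF", "ftp"),
]
labels = ["attack", "attack", "normal", "normal", "attack", "normal"]

ranking = rank_features(rows, labels, names)
# Keep only features whose gain exceeds an arbitrary illustrative threshold.
selected = [name for name, gain in ranking if gain > 0.5]
```

On this toy data the `flag` feature separates the two classes perfectly, so it ranks first and is the only feature kept. A real pipeline would discretize continuous features first and choose the threshold by validation.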
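[Degree-Granting Institution]: Beijing University of Posts and Telecommunications
[Degree Level]: Doctoral
[Year Degree Granted]: 2016
[Classification Numbers]: TP393.08; TP181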
Article ID: 1580412
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1580412.html