Research on Intrusion Detection with a Resampling-Based Cascade Classifier
[Abstract]: With the rapid development of information technology and the spread of networks, the Internet has become an important part of people's work and daily life. At the same time, malicious information theft, personal attacks, and illegal exploitation of the Internet are increasing, so network security problems are growing more serious and network security research is becoming more important. Intrusion detection, a hot topic in network security, is the process of detecting behavior that violates the safe use of a computer network or system. As information technology develops, the complexity of computer systems grows exponentially, which makes intrusion detection much harder. A study of network intrusion detection methods shows that common methods focus on improving the overall detection rate and neglect the detection rate of some important categories. As a result, R2L (unauthorized access from a remote host) and U2R (unauthorized access to local superuser privileges) attacks have low detection rates. Once either type of intrusion succeeds, server resources can be stolen or destroyed, so improving their detection performance is urgent. This thesis analyzes the main causes of the poor detection results for R2L and U2R attacks. The first cause is that the data distribution is imbalanced, which skews the classification.
In other words, this is an imbalanced classification problem (the class distribution in the training set is extremely uneven, with one or several classes having far more or far fewer samples than the others). The second cause is that these two attack types are difficult to distinguish by packet header features alone; detailed packet content information is required. An analysis of common intrusion detection methods shows that they all apply the same method to every attack type, which makes ideal results hard to achieve, whereas a cascade of multiple classifiers can effectively address the imbalanced data distribution in intrusion detection. Because intrusion detection is a typical imbalanced classification problem, this thesis studies imbalanced classification methods such as resampling in depth. To handle the noise and boundary samples that arise when SMOTE resamples the intrusion detection data set, the NCL (neighborhood cleaning) filter is introduced, yielding an improved resampling method, SMOTE-NCL, which filters out noise and boundary data. The cascade classifier is adopted for intrusion detection because of its advantages on imbalanced classification and its good results in intrusion detection. Considering the influence of the feature dimensionality of the data set on detection performance, the thesis selects feature subsets for the cascade classifier with an improved CGFR feature selection method.
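The idea of oversampling with SMOTE and then cleaning with an NCL-style filter can be sketched as follows. This is a minimal illustration and not the thesis's implementation: the function names (`nearest`, `smote`, `ncl_clean`, `smote_ncl`), the toy data, the Euclidean distance, and k = 3 are all assumptions, and the cleaning step is a simplified form of neighborhood cleaning that omits the class-size conditions of the full rule.

```python
import math
import random
from collections import Counter


def nearest(idx, X, candidates, k):
    """Indices of the k candidates closest to X[idx] (Euclidean), excluding idx."""
    others = [j for j in candidates if j != idx]
    others.sort(key=lambda j: math.dist(X[idx], X[j]))
    return others[:k]


def smote(X, y, minority, n_new, k=3, seed=0):
    """Append n_new synthetic minority samples (needs at least 2 minority samples)."""
    rng = random.Random(seed)
    minority_idx = [i for i, label in enumerate(y) if label == minority]
    X_out, y_out = list(X), list(y)
    for _ in range(n_new):
        i = rng.choice(minority_idx)
        j = rng.choice(nearest(i, X, minority_idx, k))
        gap = rng.random()
        # new point lies on the segment between sample i and its neighbour j
        X_out.append([a + gap * (b - a) for a, b in zip(X[i], X[j])])
        y_out.append(minority)
    return X_out, y_out


def ncl_clean(X, y, minority, k=3):
    """Simplified neighbourhood cleaning (assumption: no class-size conditions)."""
    everyone = range(len(X))
    drop = set()
    for i in everyone:
        neighbours = nearest(i, X, everyone, k)
        vote = Counter(y[j] for j in neighbours).most_common(1)[0][0]
        if vote == y[i]:
            continue
        if y[i] != minority:
            # non-minority sample contradicted by its neighbours: treat as noise
            drop.add(i)
        else:
            # misclassified minority sample: remove the other-class neighbours
            drop.update(j for j in neighbours if y[j] != minority)
    keep = [i for i in everyone if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep]


def smote_ncl(X, y, minority, n_new, k=3, seed=0):
    """Oversample the minority class, then filter noisy and boundary samples."""
    X_s, y_s = smote(X, y, minority, n_new, k, seed)
    return ncl_clean(X_s, y_s, minority, k)


# Toy data (invented): six Normal points, three U2R points,
# and one Normal point sitting inside the U2R cluster.
X_toy = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8],
         [5, 5], [5, 6], [6, 5], [5.2, 5.2]]
y_toy = ["Normal"] * 6 + ["U2R"] * 3 + ["Normal"]

X_res, y_res = smote_ncl(X_toy, y_toy, minority="U2R", n_new=4)
print(Counter(y_res))
```

In this toy run the stray Normal point inside the U2R cluster is removed by the cleaning step, while the synthetic U2R samples are kept. A library such as imbalanced-learn offers maintained versions of both steps and would be the more practical choice on a real data set.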
CGFR and SMOTE-NCL are then applied to the cascade classifier, and on this basis a resampling-based cascade classifier intrusion detection model is proposed to address the poor detection of R2L and U2R attacks in existing intrusion detection methods. Based on theoretical analysis and experiments, the classification algorithms chosen for the cascade are the decision tree algorithm (C4.5) and Naive Bayes (NB). The first classifier in the cascade is trained on three classes: DoS (denial-of-service attack), Probe (port scan), and Normal (normal data); the second classifier is trained on Normal, R2L, and U2R. During detection, the test set first enters the first classifier, and the samples it classifies as Normal are passed to the second classifier, so that the five categories DoS, Probe, Normal, R2L, and U2R are finally classified. The experiments compare the cascade classifier's results on the feature subsets selected by the feature selection method and by the CGFR method, and compare its results on the original data set and on data sets resampled with SMOTE and with SMOTE-NCL at different sampling rates. Finally, SVM, KNN, NB, C4.5, and the cascade classifier are compared on the SMOTE-NCL resampled data set. The cascade classifier intrusion detection model based on CGFR and SMOTE-NCL achieves higher AUC values than the other configurations for both U2R and R2L attacks.
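The two-stage routing described above can be sketched as follows. This is a hedged illustration only: the abstract does not say which of C4.5 and NB is used at which stage, so a tiny Gaussian Naive Bayes stands in for both stages here, and the class names `GaussianNB` and `Cascade`, the helper `subset`, and the two-feature toy data are all invented for the example.

```python
import math
from collections import defaultdict


class GaussianNB:
    """Tiny Gaussian Naive Bayes, used as a stand-in for both stages (assumption)."""

    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        self.stats = {}
        for label, rows in groups.items():
            means = [sum(col) / len(rows) for col in zip(*rows)]
            # small floor keeps every variance strictly positive
            variances = [
                sum((v - m) ** 2 for v in col) / len(rows) + 1e-6
                for col, m in zip(zip(*rows), means)
            ]
            self.stats[label] = (math.log(len(rows) / len(X)), means, variances)
        return self

    def predict_one(self, row):
        def log_posterior(label):
            prior, means, variances = self.stats[label]
            return prior + sum(
                -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
                for v, m, var in zip(row, means, variances)
            )
        return max(self.stats, key=log_posterior)


STAGE1 = {"DoS", "Probe", "Normal"}   # classes seen by the first classifier
STAGE2 = {"Normal", "R2L", "U2R"}     # classes seen by the second classifier


def subset(X, y, labels):
    """Keep only the samples whose label is in `labels`."""
    pairs = [(row, label) for row, label in zip(X, y) if label in labels]
    return [p[0] for p in pairs], [p[1] for p in pairs]


class Cascade:
    def fit(self, X, y):
        self.first = GaussianNB().fit(*subset(X, y, STAGE1))
        self.second = GaussianNB().fit(*subset(X, y, STAGE2))
        return self

    def predict(self, X):
        out = []
        for row in X:
            label = self.first.predict_one(row)
            if label == "Normal":
                # only samples judged Normal go on to the second classifier
                label = self.second.predict_one(row)
            out.append(label)
        return out


# Toy data (invented): feature 0 separates DoS and Probe from the rest,
# feature 1 separates Normal, R2L and U2R.
X_train = [[10, 0], [11, 0.5], [10.5, 1], [-10, 0], [-11, 0.5], [-10.5, 1],
           [0, 0], [0.5, 0.5], [-0.5, 1], [0, 5], [0.5, 5.5], [-0.5, 6],
           [0, -5], [0.5, -5.5], [-0.5, -6]]
y_train = ["DoS"] * 3 + ["Probe"] * 3 + ["Normal"] * 3 + ["R2L"] * 3 + ["U2R"] * 3

model = Cascade().fit(X_train, y_train)
print(model.predict([[10.2, 0.3], [0.2, 5.2], [0.1, -5.3]]))
```

Note that the first stage never sees R2L or U2R during training, so the design relies on those samples being labelled Normal at stage one and then separated at stage two, where the class balance is less dominated by DoS and Probe traffic.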
However, the detection results for R2L are still not ideal. R2L attacks are difficult to distinguish by packet header features: the header features of a large number of R2L samples are hard to tell apart from those of Normal samples, so the detailed content of the packets is needed to identify them. To address this further, the author proposes extracting some features from packet contents during data extraction and generating the training and test sets dynamically, which is left as future work.
[Degree-granting institution]: Southwest University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.08