Research on Unsupervised Network Anomalous Traffic Detection Based on Outlier Weights and Subspace Clustering
Published: 2018-09-07 22:08
[Abstract]: With the rapid development of information and network technology, the information resources available online have grown richer, and convenient communication has greatly shortened the distance between people. At the same time, this has brought serious threats to computer security, and the importance of information and network security has become increasingly prominent. Detecting attacks and abnormal behavior in a network promptly and effectively has become a very important topic in network security. Traditional network anomaly and intrusion detection algorithms generally need labeled datasets to train their models, but labeled data is costly to obtain in real network environments, and such models are helpless against newly emerging anomalous traffic that they were never trained on. Data mining is a widely used data processing technique that can extract latent, factually grounded rules or knowledge from large volumes of data. Clustering is an effective unsupervised learning method within data mining: it builds a detection model directly on unlabeled data to discover known or unknown anomalies, so unsupervised clustering is often combined with network anomalous traffic detection. Against this background, and based on an analysis of traffic in a real network environment, this thesis adopts an entropy-based feature extraction method that effectively reduces the complexity of raw real-time network data. Building on the density peak clustering algorithm, it proposes a novel density-based outlier weight measure and uses it to construct a new unsupervised anomalous traffic detection model based on density outlier weights and subspace clustering. The model computes the outlier weight of traffic in each subspace and ranks the weights to obtain the final anomalous traffic, which avoids having to wait until clustering has finished before detection can start and thus greatly reduces computational complexity.
The thesis also proposes a second, distance-based outlier weight measure and combines it with the K-means clustering algorithm to build another new unsupervised anomalous traffic detection model. Both methods remove the dependence of traditional detection models on labeled datasets, considerably improve accuracy and recall for real-time anomalous traffic, and significantly reduce detection time. Finally, the detection models are evaluated experimentally on an intranet dataset from an information security company in a real environment and on the simulated KDD Cup99 dataset. The results show that the proposed models significantly improve detection accuracy and reduce the false detection rate.
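The abstract gives no formulas, so the following is only a minimal sketch of the general idea, not the thesis's actual method. It assumes that each time window of flows is summarized by the Shannon entropy of its source addresses and destination ports, and it uses the distance to the nearest K-means centroid as a stand-in for the distance-based outlier weight. The function names, the choice of features, the deterministic initialization, and the toy windows are all hypothetical, and the subspace step is omitted.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy (in bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def window_features(window):
    """Summarize one window of (src_ip, dst_port) flows as two entropies."""
    return (shannon_entropy([flow[0] for flow in window]),
            shannon_entropy([flow[1] for flow in window]))


def kmeans(points, k, iters=20):
    """Plain Lloyd's K-means; seeds with the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[j].append(p)
        for i, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return centroids


def outlier_weights(points, centroids):
    """Stand-in outlier weight: distance from each point to its nearest centroid."""
    return [min(math.dist(p, c) for c in centroids) for p in points]


# Invented toy data: each window holds four (src_ip, dst_port) flows.
windows = [
    [("10.0.0.1", 80), ("10.0.0.2", 80), ("10.0.0.3", 443), ("10.0.0.4", 80)],
    [("10.0.0.9", 22), ("10.0.0.9", 22), ("10.0.0.9", 22), ("10.0.0.9", 22)],
    [("10.0.0.1", 80), ("10.0.0.5", 80), ("10.0.0.6", 80), ("10.0.0.7", 80)],
    [("10.0.0.9", 22), ("10.0.0.9", 22), ("10.0.0.8", 22), ("10.0.0.9", 22)],
    [("10.0.0.2", 443), ("10.0.0.3", 443), ("10.0.0.4", 80), ("10.0.0.5", 80)],
    # One source probing four different ports (scan-like window).
    [("10.0.0.66", 21), ("10.0.0.66", 22), ("10.0.0.66", 23), ("10.0.0.66", 25)],
]

features = [window_features(w) for w in windows]
centroids = kmeans(features, k=2)
weights = outlier_weights(features, centroids)
# Rank windows from most to least anomalous by their weight.
ranking = sorted(range(len(windows)), key=lambda i: weights[i], reverse=True)
print(ranking)
```

In this toy data the scan-like window (index 5) receives the largest weight and is ranked first. A known limitation of this stand-in score is that an isolated anomaly can capture a centroid of its own and then receive a weight of zero, so the result depends on the number of clusters and on the initialization.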
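[Degree-granting institution]: Chongqing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree awarded]: 2016
[Classification number]: TP393.08
Article ID: 2229551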
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2229551.html