Research on Cluster Analysis of Network Intrusion Data

Published: 2018-04-28 14:26

Topic: intrusion detection + cluster analysis; source: master's thesis, Shenzhen University, 2015

[Abstract]: Network technology has developed rapidly in recent years, and network security problems have become correspondingly more prominent. To address them, many researchers have studied intrusion detection techniques, hoping that intrusion detection systems can better protect the networks we rely on. The core of an intrusion detection system is its intrusion analysis module, and a wide variety of analysis techniques have been studied for it. Intrusion analysis can be viewed as a data mining process: cluster analysis can mine knowledge from massive volumes of network data, is well suited to identifying intrusion behavior, and is already widely used in intrusion detection systems. This thesis combines specific clustering methods with intrusion detection. It studies the classical K-means, Fuzzy ART, and Kohonen clustering algorithms in depth, analyzes the characteristics and shortcomings of each, proposes two improved algorithms that address these shortcomings, applies them to the detection of network intrusion data, and compares their detection performance in simulation experiments.

The main work of the thesis is as follows:

(1) Extracting experimental data from the KDD CUP99 dataset. KDD CUP99 is a standard dataset for intrusion analysis, and much intrusion detection research is based on it. One of the experimental datasets used in this thesis is drawn from it. The thesis studies the dataset in depth and uses principal component analysis (PCA) to extract dimension-reduced intrusion data that still retains the main information of the original data.

(2) Proposing an improved K-means algorithm based on Fuzzy ART. Because Fuzzy ART automatically creates new nodes during clustering, it is used to perform a preliminary clustering of the raw data, which supplies K-means with cluster centers that match the data distribution and with the number of clusters K.

(3) Improving the weight adjustment rule used in Kohonen network learning. A membership degree is introduced into the learning process of the traditional Kohonen network, and the neurons in the winning neighborhood learn according to their membership degrees, so that neuron learning better reflects the characteristics of the samples.

(4) Experimental analysis. Traditional Fuzzy ART, K-means, and the improved FART K-means algorithm are compared on two different standard network intrusion datasets; the results show that the improved FART K-means algorithm achieves some improvement in both detection accuracy and clustering speed. Likewise, the traditional Kohonen algorithm and the improved I-Kohonen algorithm are compared in simulation experiments; the results show that I-Kohonen raises the detection rate on intrusion data while maintaining running speed.

Both improved algorithms achieve satisfactory results in the cluster analysis of intrusion data and detect intrusion data well. The thesis makes two main contributions: (1) it improves how K-means selects the value of K and the initial centers; (2) it optimizes the weight learning rule of the Kohonen network.
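The PCA step described in item (1) can be illustrated with a minimal sketch. It assumes NumPy and uses synthetic data as a stand-in for preprocessed KDD CUP99 records. The function name `pca_reduce`, the number of retained components, and the data are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each feature
    # SVD of the centered data: the rows of Vt are the principal directions
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)        # variance ratio per component
    return Xc @ components.T, components, explained[:n_components]

# Synthetic stand-in: 6 observed features driven by 2 latent factors plus noise
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 6))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 6))

Z, components, ratio = pca_reduce(X, n_components=2)
print(Z.shape, ratio.sum())
```

The summed variance ratio is one way to check that the reduced data "still retains the main information": components are kept until the ratio is judged high enough.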
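The idea in item (2), letting Fuzzy ART supply K and the initial centers for K-means, might be sketched as follows. This is a simplified illustration under stated assumptions, not the thesis's FART K-means implementation: it uses a single Fuzzy ART pass with complement coding and fast learning, takes the mean of each ART category as a seed, and the parameter values (`rho`, `alpha`, `beta`) and all function names are hypothetical.

```python
import numpy as np

def fuzzy_art(X, rho=0.75, alpha=0.001, beta=1.0):
    """One Fuzzy ART pass over X (features in [0, 1]); returns a category per sample."""
    I = np.hstack([X, 1.0 - X])                       # complement coding
    weights, labels = [], []
    for x in I:
        chosen = -1
        if weights:
            W = np.array(weights)
            match = np.minimum(x, W).sum(axis=1)      # |x AND w_j| (fuzzy AND = min)
            choice = match / (alpha + W.sum(axis=1))  # choice function T_j
            for j in np.argsort(-choice):             # try categories by descending T_j
                if match[j] / x.sum() >= rho:         # vigilance test
                    chosen = int(j)
                    break
        if chosen < 0:
            weights.append(x.copy())                  # nothing resonates: create a new node
            chosen = len(weights) - 1
        else:
            w = weights[chosen]
            weights[chosen] = beta * np.minimum(x, w) + (1.0 - beta) * w
        labels.append(chosen)
    return np.array(labels)

def kmeans(X, centers, n_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = centers.copy()
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            X[labels == k].mean(axis=0) if np.any(labels == k) else centers[k]
            for k in range(len(centers))
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return centers, labels

def fart_kmeans(X, rho=0.75):
    """Seed K-means with the number and means of the Fuzzy ART categories."""
    art_labels = fuzzy_art(X, rho=rho)
    k = art_labels.max() + 1                          # K is found by Fuzzy ART
    seeds = np.array([X[art_labels == j].mean(axis=0) for j in range(k)])
    return kmeans(X, seeds)

# Synthetic stand-in for normalized intrusion features: three tight blobs in [0, 1]^2
rng = np.random.default_rng(0)
true_means = np.array([[0.15, 0.15], [0.85, 0.15], [0.5, 0.85]])
X = np.clip(np.vstack([m + 0.02 * rng.normal(size=(50, 2)) for m in true_means]), 0, 1)

centers, labels = fart_kmeans(X)
print(len(centers), centers)
```

The vigilance parameter `rho` controls how many categories Fuzzy ART creates, so in this sketch it takes over the role that a hand-picked K plays in standard K-means.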
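The abstract does not state the update formula behind item (3), so the following is only one possible reading. It assumes a one-dimensional Kohonen map in which the usual neighborhood factor is replaced by a fuzzy c-means style membership of the sample in each neuron of the winning neighborhood. The membership formula, the fuzzifier `m`, the decay schedules, and the function name are all assumptions made for illustration.

```python
import numpy as np

def train_membership_som(X, n_neurons=4, n_epochs=20, lr0=0.5,
                         radius0=1.0, m=2.0, seed=0):
    """1-D Kohonen map whose neighborhood update is scaled by a membership degree."""
    rng = np.random.default_rng(seed)
    # Initialize the weights from randomly chosen samples
    W = X[rng.choice(len(X), n_neurons, replace=False)].astype(float)
    grid = np.arange(n_neurons)
    for epoch in range(n_epochs):
        frac = epoch / n_epochs
        lr = lr0 * (1.0 - frac)                        # decaying learning rate
        radius = max(radius0 * (1.0 - frac), 0.5)      # shrinking neighborhood
        for x in X[rng.permutation(len(X))]:
            dist = np.linalg.norm(W - x, axis=1)
            winner = dist.argmin()
            hood = np.abs(grid - winner) <= radius     # winning neighborhood
            # Assumed membership: closer neurons get a larger share; shares sum to 1
            inv = 1.0 / (dist[hood] ** (2.0 / (m - 1.0)) + 1e-12)
            u = inv / inv.sum()
            # Each neighborhood neuron moves toward x in proportion to its membership
            W[hood] += lr * u[:, None] * (x - W[hood])
    return W

# Synthetic stand-in for two classes of traffic records
rng = np.random.default_rng(1)
X = np.vstack([
    0.2 + 0.03 * rng.normal(size=(60, 2)),
    0.8 + 0.03 * rng.normal(size=(60, 2)),
])
W = train_membership_som(X)
assigned = np.linalg.norm(X[:, None, :] - W[None, :, :], axis=2).argmin(axis=1)
print(W)
```

In a traditional Kohonen update every neuron in the neighborhood is scaled by a factor that depends only on its grid distance from the winner; here the scaling depends on how close each neuron's weight vector is to the sample, which is one way neuron learning could "better reflect the characteristics of the samples".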
[Degree-granting institution]: Shenzhen University
[Degree level]: Master's
[Year degree granted]: 2015
[Classification number]: TP393.08
