
Anomalous Traffic Analysis and Detection Based on Data Mining

Published: 2018-10-11 10:34
[Abstract]: With the rapid growth of the Internet, both the scale of networks and the variety of services they carry keep increasing. This growth brings great convenience, but it also raises the likelihood of network anomalies. Detecting anomalous traffic accurately and quickly, and responding in a timely and reasonable way, therefore has practical significance and application value. In recent years, researchers have proposed anomalous traffic detection methods based on data mining, which automatically discover hidden, useful knowledge in massive data and turn it into detection rules that expose anomalies; this area has been studied extensively. First, this thesis surveys the development and current state of anomalous traffic detection and analysis in China and abroad. It then outlines the definition and classification of anomalous traffic and the main anomaly detection methods, analyzes and compares mainstream traffic monitoring and anomalous traffic detection techniques in detail, and explains the strengths and weaknesses of each in terms of its underlying principle. Second, the thesis studies clustering algorithms from data mining and applies the density-based DBSCAN algorithm to anomalous traffic detection. An improved grid-based DBSCAN clustering method is trained and tested on an offline data set to obtain the characteristic trends of anomalous traffic and to separate normal behavior from anomalous behavior. The method can find clusters of arbitrary shape and varying size, identifies border points effectively, and removes noise points, which makes the clustering results more accurate while also improving execution efficiency. Third, the thesis studies methods for classifying anomalous traffic. Cross-entropy is used to measure changes in the distribution of traffic features: when anomalous behavior occurs, the cross-entropy between two consecutive observation points rises sharply. The thesis uses the cross-entropy of eight feature attributes (source IP address, destination IP address, source port, destination port, flow size, in-degree, out-degree, and packet count) to classify anomalous network traffic. It defines attribute profiles for five kinds of anomalous traffic (worms, DoS attacks, DDoS attacks, port-scan attacks, and anomalous P2P traffic) and uses Euclidean distance to determine the attack type. Classifying anomalous traffic by its features in this way improves classification accuracy. Finally, the thesis builds an anomalous traffic monitoring model from the offline KDD 99 data set, the grid-based DBSCAN algorithm, and cross-entropy theory, collects traffic data as NetFlow-format network flows, and detects and analyzes simulated real-time traffic, providing a basis for quickly locating network anomalies, pinpointing their causes, and proposing solutions.
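The cross-entropy and Euclidean-distance steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The function names, the `eps` floor for values unseen in the previous window, the base-2 logarithm, and the toy signature vectors are all assumptions introduced here for illustration. The sketch computes the cross-entropy of one feature's distribution between two consecutive windows and then assigns an eight-dimensional cross-entropy vector to the nearest attack profile.

```python
import math
from collections import Counter

# The eight flow attributes named in the abstract (identifier spellings are assumed).
FEATURES = ["src_ip", "dst_ip", "src_port", "dst_port",
            "flow_size", "in_degree", "out_degree", "packet_count"]


def distribution(values):
    """Empirical probability of each distinct value in one observation window."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def cross_entropy(p, q, eps=1e-6):
    """H(p, q) = -sum_x p(x) * log2 q(x).

    Values present in p but absent from q are floored at eps (an assumption of
    this sketch) so the logarithm stays finite.
    """
    return -sum(px * math.log2(max(q.get(x, 0.0), eps)) for x, px in p.items())


def nearest_signature(vector, signatures):
    """Return the label whose signature vector is closest in Euclidean distance."""
    return min(signatures, key=lambda label: math.dist(vector, signatures[label]))


# Toy data: source addresses seen in two consecutive windows.
previous = distribution(["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3"])
unchanged = distribution(["10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.3"])
burst = distribution(["10.0.0.%d" % i for i in range(10, 14)])  # all new sources

# Hypothetical placeholder profiles, one value per entry of FEATURES.
# These numbers are NOT taken from the thesis.
TOY_SIGNATURES = {
    "port_scan": (0, 0, 0, 1, 0, 0, 0, 0),
    "ddos": (1, 0, 0, 0, 0, 1, 0, 0),
}

if __name__ == "__main__":
    print(cross_entropy(unchanged, previous))  # baseline between similar windows
    print(cross_entropy(burst, previous))      # much larger after the shift
    print(nearest_signature((0.1, 0, 0, 0.9, 0, 0, 0, 0), TOY_SIGNATURES))
```

In practice the eight-dimensional vector would be built by calling `cross_entropy` once per feature on consecutive windows of flow records, and the profiles would come from labelled traffic rather than hand-written constants.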
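[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP311.13; TP393.06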

Article ID: 2263878


Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2263878.html

