
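Research on Feature Engineering in Network Traffic Classification

Published: 2018-06-25 22:16

  Topics: network traffic classification + min-max rule; Source: Nanjing University of Posts and Telecommunications, 2017 master's thesis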


[Abstract]: Analyzing and classifying network traffic is a key means of network monitoring and management, and it is widely used in network intrusion detection systems, network management systems, and related fields. However, with the spread of dynamic port numbers and traffic encryption, traditional traffic classification methods alone can no longer deliver the required accuracy. In recent years, machine-learning-based traffic classification has attracted wide attention: it only requires defining a set of traffic-related statistics as features and does not rely on port numbers to characterize traffic, which avoids the limitations of the traditional methods. Network traffic classification nevertheless suffers from problems such as class imbalance and large data volume, so applying conventional machine learning algorithms unchanged also yields poor classification performance. Taking this as its starting point, this thesis studies and improves machine learning algorithms and applies them to network traffic classification to improve performance. It proposes an ensemble feature selection algorithm based on the min-max strategy to address the class imbalance problem in traffic classification. The algorithm combines feature selection with ensemble learning and consists of two main steps: data partitioning and integration of the feature selection results. The original dataset is first divided into several data subsets; feature selection is performed on each subset; and the per-subset results are then integrated through the min-max strategy to obtain the final feature selection result. The algorithm is compared with other ensemble feature selection algorithms, mainly to verify its performance in network traffic classification.
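The two steps named above (data partitioning, then min-max integration) can be illustrated with a minimal sketch. The abstract does not specify the partitioning method, the base feature selector, or the exact form of the min-max rule, so everything below is an assumption: a binary problem, random partitioning, a Fisher-style score as a stand-in selector, and the min-then-max combination familiar from min-max modular networks. The function names are hypothetical and are not taken from the thesis.

```python
import numpy as np

def fisher_score(X_pos, X_neg):
    """Per-feature separation: squared mean gap divided by summed variances.
    A stand-in base selector; the thesis may use a different one."""
    gap = (X_pos.mean(axis=0) - X_neg.mean(axis=0)) ** 2
    spread = X_pos.var(axis=0) + X_neg.var(axis=0) + 1e-12
    return gap / spread

def min_max_feature_scores(X, y, n_pos_parts, n_neg_parts, seed=0):
    """Score features on every (positive subset, negative subset) pair,
    then combine: MIN over negative subsets, MAX over positive subsets.
    Assumes labels are 0/1 and each part ends up non-empty."""
    rng = np.random.default_rng(seed)
    pos_idx = rng.permutation(np.flatnonzero(y == 1))
    neg_idx = rng.permutation(np.flatnonzero(y == 0))
    pos_parts = np.array_split(pos_idx, n_pos_parts)
    neg_parts = np.array_split(neg_idx, n_neg_parts)
    # scores[i, j, f]: score of feature f on positive part i vs negative part j
    scores = np.array([[fisher_score(X[p], X[n]) for n in neg_parts]
                       for p in pos_parts])
    return scores.min(axis=1).max(axis=0)

def select_top_k(scores, k):
    """Indices of the k highest-scoring features, best first."""
    return np.argsort(scores)[::-1][:k]
```

Splitting the majority class into parts of roughly the minority-class size makes each pairwise subproblem approximately balanced, which is the usual motivation for this kind of partitioning under class imbalance.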
To further improve classification performance, the thesis considers the correlation between flows and, building on the earlier traffic dataset, extracts a set of features based on the temporal and spatial correlation among multiple flows, such as the number of flows in the set that share the same source IP address as the flow being classified. Finally, the dataset augmented with these multi-flow features is passed through the proposed ensemble feature selection strategy and then classified, in order to assess the effect of multi-flow features on network traffic classification. Experiments show that performance improves markedly once some of the multi-flow features are included. The proposed ensemble feature selection algorithm handles the class imbalance problem in traffic classification effectively, and the extracted multi-flow features also improve classification performance to some extent.
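The example multi-flow feature mentioned above, the number of flows sharing a source IP address with the flow being classified, could be computed roughly as follows. This is a sketch under stated assumptions: the record layout `(src_ip, start_time)`, the optional `window` parameter for temporal locality, and the function name are all hypothetical, since the abstract does not give the exact feature definition.

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

def same_source_flow_counts(flows, window=None):
    """For each flow, count the OTHER flows with the same source IP.

    flows  : list of (src_ip, start_time) tuples (assumed layout)
    window : if given, only count flows whose start time lies within
             +/- window seconds of the flow in question (assumed knob)
    """
    times_by_ip = defaultdict(list)
    for src_ip, start in flows:
        times_by_ip[src_ip].append(start)
    for times in times_by_ip.values():
        times.sort()  # sorted start times allow binary search per flow

    counts = []
    for src_ip, start in flows:
        times = times_by_ip[src_ip]
        if window is None:
            n = len(times)
        else:
            lo = bisect_left(times, start - window)
            hi = bisect_right(times, start + window)
            n = hi - lo
        counts.append(n - 1)  # exclude the flow itself
    return counts

flows = [("10.0.0.1", 0.0), ("10.0.0.1", 5.0),
         ("10.0.0.1", 100.0), ("10.0.0.2", 1.0)]
print(same_source_flow_counts(flows))             # [2, 2, 2, 0]
print(same_source_flow_counts(flows, window=10))  # [1, 1, 0, 0]
```

The resulting counts would be appended as an extra column to the per-flow statistical features before feature selection.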
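[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TP393.06; TP181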

