Research on a P2P Traffic Identification Method Using Particle Swarm Optimization Fused with Cuckoo Search
Published: 2019-02-14 12:21
[Abstract]: With the development of the Internet, peer-to-peer (P2P) networking technology has come into wide use and now accounts for more than 50% of total Internet service volume. While P2P brings convenience to work and daily life, it also causes network congestion and information security problems. P2P traffic therefore needs to be managed and controlled, which makes P2P traffic identification a key problem. P2P traffic identification is essentially a pattern recognition problem whose accuracy depends largely on the traffic features selected and the classifier constructed. This thesis studies the application of cuckoo search and particle swarm optimization (PSO) to P2P traffic feature selection and to the construction of an optimal P2P classifier. The main work is as follows.

(1) P2P traffic feature selection using PSO fused with cuckoo search. In P2P traffic identification, a single feature usually yields a low recognition rate, so multiple features must be introduced in practice to raise it. Although a support vector machine (SVM) classifier can mitigate the curse of dimensionality, it cannot avoid the problem when there are too many features. Excess features also increase the workload of sampling traffic features, lower the efficiency of the identification algorithm, and make real-time requirements hard to meet. A new feature selection method based on PSO fused with cuckoo search is therefore introduced to select, from the many candidate features, the feature subset with the best classification performance, improving both identification accuracy and computational efficiency.

(2) A P2P traffic identification method based on PSO fused with cuckoo search. SVM handles well the problems that traditional machine learning faces, such as overfitting, underfitting, entrapment in local optima, and the curse of dimensionality, so this thesis applies SVM to P2P traffic identification. However, the penalty parameter, the kernel function, and the kernel function's parameters strongly affect SVM performance. There is no generally accepted tuning method in practice: common methods are either computationally expensive, such as grid search, or prone to local optima, such as SVM parameter optimization based on genetic algorithms. This thesis therefore uses PSO fused with cuckoo search to optimize the SVM parameters.

Finally, the proposed feature selection and SVM parameter optimization methods were tested on data from the UCI Machine Learning Repository and on real campus P2P data, and compared experimentally with the existing genetic algorithm, PSO, and cuckoo search algorithm. The results show that the proposed feature selection algorithm obtains an excellent feature subset, and that the SVM optimized by the proposed algorithm achieves better identification performance.
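The abstract does not say how cuckoo search and PSO are fused, so the following is only a minimal sketch under one assumed scheme: each iteration runs a standard PSO velocity and position update, then, with probability `pa`, perturbs a particle's personal best with a cuckoo-search-style Lévy flight (Mantegna's algorithm) and keeps the candidate only if it improves. The function names, parameter defaults, and the `sphere` objective are all hypothetical. In the thesis setting the objective would presumably be something like the cross-validated SVM error over the penalty and kernel parameters, which is not reproduced here.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed Levy step using Mantegna's algorithm."""
    sigma = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma)
    v = rng.gauss(0.0, 1.0) or 1e-12  # guard against an exact zero draw
    return u / abs(v) ** (1 / beta)


def clamp(x, lo, hi):
    """Keep a coordinate inside its search bounds."""
    return max(lo, min(hi, x))


def hybrid_pso_cs(objective, bounds, n_particles=20, n_iter=100,
                  w=0.7, c1=1.5, c2=1.5, pa=0.25, alpha=0.01, seed=0):
    """Minimise `objective` with PSO plus Levy-flight perturbations.

    Assumed fusion scheme (not taken from the thesis): PSO update first,
    then cuckoo-style Levy flights around personal bests, greedy acceptance.
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iter):
        # PSO phase: inertia + cognitive + social terms.
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = clamp(pos[i][d] + vel[i][d], lo, hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val

        # Cuckoo phase: Levy flight around a personal best, kept only if better.
        for i in range(n_particles):
            if rng.random() < pa:
                cand = [clamp(pbest[i][d] + alpha * (hi - lo) * levy_step(rng), lo, hi)
                        for d, (lo, hi) in enumerate(bounds)]
                val = objective(cand)
                if val < pbest_val[i]:
                    pbest[i], pbest_val[i] = cand, val
                    pos[i] = cand[:]
                    if val < gbest_val:
                        gbest, gbest_val = cand[:], val

    return gbest, gbest_val


def sphere(x):
    """Stand-in objective; a real run would score SVM parameters instead."""
    return sum(v * v for v in x)


best, best_val = hybrid_pso_cs(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_val)
```

The occasional long jumps of the Lévy flight are what is meant to help the swarm leave local optima, while greedy acceptance keeps the best-so-far value from getting worse. Feature selection would need a binary encoding of positions, which this continuous sketch does not cover.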
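[Degree-granting institution]: Hubei University of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP393.02; TP18
Article ID: 2422198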
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2422198.html