
Research on P2P Traffic Identification Technology Based on Unsupervised Learning

Published: 2019-06-06 06:25
[Abstract]: As P2P network technology has developed, P2P applications have become increasingly widespread, and identifying P2P traffic has been a long-standing goal of P2P researchers. The growing number of applications makes that identification increasingly difficult.

This paper begins by introducing P2P technology and analyzing several typical P2P traffic identification techniques. Drawing on their strengths and weaknesses, it proposes an improved algorithm: a clustering algorithm based on unsupervised learning. The paper first analyzes the statistical characteristics of P2P traffic at the packet level and the flow level, and selects five feature attributes suited to the experiments: the mean variance of packet size within a P2P flow, the duration of the flow, the rate of change of packet size within the flow, the average number of bytes per packet in the flow, and the ratio of download speed to upload speed. These features are used in the experimental validation of the DBK algorithm presented later. The paper then briefly reviews the strengths and weaknesses of the K-means and DBSCAN algorithms and improves on them to obtain a DBSCAN-based improved K-means algorithm (the DBK algorithm). The Bayesian information criterion (BIC) is incorporated into the search for the algorithm's initial points, yielding BIC core points that serve as the initial nodes, after which clustering is performed with K-means.

Finally, the DBK algorithm is evaluated experimentally and compared with K-means and DBSCAN in terms of accuracy, misclassification rate, and other measures. The results show that DBK has a longer running time, but that it compares favorably with the other two algorithms in the number of external-storage accesses and in average accuracy, and that its average misclassification rate is relatively low. The paper concludes that the improved algorithm is both effective and feasible.
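The seeding idea described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the BIC step selects core points, so that step is omitted here; instead, this sketch assumes the simplest variant: find DBSCAN core points, group them by density connectivity, and use each group's centroid as an initial K-means center. The function names (`core_point_groups`, `kmeans`, `dbk_sketch`) and the parameters `eps` and `min_pts` are hypothetical and are not taken from the paper.

```python
import math


def core_point_groups(points, eps, min_pts):
    """Group DBSCAN core points by density connectivity.

    A point is a core point if at least min_pts points (itself included)
    lie within distance eps. Returns a list of index lists, one per group.
    """
    n = len(points)
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    is_core = [len(nb) >= min_pts for nb in neighbors]
    seen = [False] * n
    groups = []
    for i in range(n):
        if not is_core[i] or seen[i]:
            continue
        seen[i] = True
        stack, group = [i], []
        while stack:  # depth-first walk over connected core points
            p = stack.pop()
            group.append(p)
            for q in neighbors[p]:
                if is_core[q] and not seen[q]:
                    seen[q] = True
                    stack.append(q)
        groups.append(group)
    return groups


def centroid(rows):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(col) / len(rows) for col in zip(*rows))


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iteration starting from the given centers."""
    centers = list(centers)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # keep the old center if a cluster empties
                centers[k] = centroid(members)
    return labels, centers


def dbk_sketch(points, eps, min_pts):
    """DBSCAN-seeded K-means (assumed variant; the BIC step is omitted)."""
    groups = core_point_groups(points, eps, min_pts)
    if not groups:
        raise ValueError("no core points found; try a larger eps or smaller min_pts")
    seeds = [centroid([points[i] for i in g]) for g in groups]
    return kmeans(points, seeds)
```

In this variant the number of clusters falls out of the density step rather than being chosen in advance, which is one plausible reason for combining the two algorithms. For flow data, each point would be a tuple of the five features listed above; since those features have very different units, scaling them to comparable ranges before choosing `eps` is probably advisable.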

Article ID: 2494137


Article link: https://www.wllwen.com/guanlilunwen/ydhl/2494137.html

