Research on Key Technologies of Network Traffic Measurement and Identification

Published: 2018-05-07 11:51

  Topics: traffic identification + traffic measurement; Source: 侯颖, doctoral dissertation, PLA Information Engineering University, 2015


[Abstract]: Network traffic measurement and identification underpin network management, operations, optimization, and security, and are the supporting technologies for understanding how networks operate and behave. As network technology develops, user numbers have grown sharply, link rates have risen quickly, the services carried have diversified, and information-hiding techniques are widely used. As a result, traditional identification methods based on well-known port numbers or packet payloads can no longer meet the needs of high-speed traffic identification, and effective traffic measurement and identification techniques and strategies are urgently needed to meet current and future network-management challenges.

In high-speed networks the number of concurrent flows is huge and the packet rate is high. Simple traffic features combined with fast identification algorithms can process traffic at line rate but cannot guarantee identification accuracy. Existing techniques therefore usually rely on many features and complex classification models, whose processing cost is too high for real-time use. Current techniques also ignore the differentiated management requirements of different services and cannot perform service identification under constraints. Reconciling accuracy with real-time performance is thus the key difficulty of online service identification, and recalling different network services according to their priority level is a practical network-management requirement.

This dissertation is supported by the National 863 Program major project topic "Unified Security Management and Control Network for Triple-Network Convergence" and the 863 Program theme project topic "Cross-Network Information Security Protection". Targeting the projects' requirements for real-time network identification and control, it addresses the two core stages of network service identification, traffic measurement and identification algorithms, and studies four aspects. The main work is as follows:

1. To address the excessive cost of capturing every packet when identifying high-speed traffic, an early traffic sampling algorithm based on a homologous combined Bloom filter is proposed from the perspective of packet reduction. Exploiting the fact that, among concurrent flows, the number of flows whose sampling has finished far exceeds the number still being sampled, the algorithm combines two counting Bloom filters with different counter widths, which implement the "packet counting" and "sampling decision" functions respectively. Theoretical analysis shows that adjusting the ratio of the counter widths of the two counting Bloom filters minimizes the misjudgment rate. Experiments on real traffic measuring space complexity and misjudgment rate confirm the algorithm's effectiveness: under the same memory limit its misjudgment rate is significantly lower than that of comparable algorithms, and at the same misjudgment rate it uses at least 33% less memory than the other algorithms.

2. To address the space congestion caused by flows without an end marker when conventional counting Bloom filter algorithms are used to detect large flows, a large-flow detection algorithm based on an adaptive-timeout counting Bloom filter is proposed. The algorithm designs a detection structure that combines a counting Bloom filter with a timing Bloom filter. The counter vector records each flow's packet count and is used to decide whether the flow is large. The timer vector records the arrival time of each flow's most recent packet, so that counters occupied by finished flows are cleared automatically and promptly, which resolves the space congestion caused by flows without an end marker. Building on a theoretical analysis of the structure's detection error, an adaptive timeout mechanism is proposed that adjusts the timeout according to the flow arrival intensity on the link and the length of the Bloom filter vector, keeping the algorithm's overall error rate in its lowest range. Experiments show that the algorithm's error rate is better than the best value achieved by fixed-timeout algorithms and that, with the same memory footprint, its accuracy is the highest among the reference algorithms compared.

3. Traditional traffic identification algorithms cannot meet the differentiated classification-accuracy requirements of network services. To address this, a traffic identification algorithm based on priority classification constraints is proposed. The algorithm designs a decision tree based on classification information entropy and proposes weighted pessimistic error pruning, so that the final decision tree favors high-priority service classes when classifying, which raises their recall. Experiments show that the identification results are consistent with the priority constraints and that the algorithm strikes a relative balance between modeling time and accuracy. Compared with the standard C4.5 decision tree algorithm, overall classification accuracy is slightly lower, but recall for high-priority service classes is clearly higher, so the differentiated classification constraints are satisfied, and the F-measure is comparable to that of C4.5.

4. To raise the processing speed of online traffic identification, an online identification method based on flow sets is proposed from the new perspective of traffic reduction. Exploiting the fact that flows sharing the same 3-tuple belong to the same application class, the method reduces the traffic online: only some of the flows in each set of flows with the same 3-tuple are identified, and the service class of the flow set is obtained by voting on those results. Theoretical analysis gives the relationship between the classification error rate and the number of flows examined. Experiments on classification performance and processing speed show that the method can be combined with a variety of algorithms and that, with a reasonably chosen threshold for the estimated classification error rate, both classification accuracy and processing speed improve substantially over the reference algorithms.
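The large-flow detection structure in item 2 can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the class name, parameter names, and the salted SHA-256 hashing are assumptions made for illustration, and the timeout is a fixed parameter here because the abstract does not give the adaptive rule that derives it from flow arrival intensity and vector length.

```python
import hashlib


class TimeoutCountingBloomFilter:
    """Counter vector plus timer vector addressed by the same hash positions.

    Hypothetical sketch: names and the hashing scheme are illustrative only.
    """

    def __init__(self, size, num_hashes, threshold, timeout):
        self.size = size
        self.num_hashes = num_hashes
        self.threshold = threshold        # packets needed to call a flow large
        self.timeout = timeout            # fixed here; adaptive in the thesis
        self.counters = [0] * size        # counter vector: packets per slot
        self.last_seen = [None] * size    # timer vector: last arrival per slot

    def _positions(self, flow_id):
        # Derive slot indices from salted SHA-256 digests of the flow ID.
        # A set avoids incrementing one slot twice if two hashes coincide.
        slots = set()
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{flow_id}".encode()).digest()
            slots.add(int.from_bytes(digest[:8], "big") % self.size)
        return slots

    def add_packet(self, flow_id, now):
        """Record one packet and report whether the flow now looks large."""
        estimate = None
        for pos in self._positions(flow_id):
            seen = self.last_seen[pos]
            if seen is not None and now - seen > self.timeout:
                # The slot has been silent too long: treat its flow as ended.
                self.counters[pos] = 0
            self.counters[pos] += 1
            self.last_seen[pos] = now
            value = self.counters[pos]
            estimate = value if estimate is None else min(estimate, value)
        # The smallest counter over the flow's slots estimates its packet count.
        return estimate >= self.threshold


detector = TimeoutCountingBloomFilter(size=1024, num_hashes=3,
                                      threshold=5, timeout=10.0)
flow = "10.0.0.1:1234->10.0.0.2:80/tcp"
flags = [detector.add_packet(flow, now=t) for t in range(5)]
late_flag = detector.add_packet(flow, now=100.0)  # arrives after the timeout
```

In this run the fifth packet pushes the flow over the threshold, while the packet at time 100 finds expired slots, so counting restarts from one. No end-of-flow marker is needed to reclaim the counters.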
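The flow-set voting idea in item 4 can likewise be sketched in a few lines of Python. This is an illustration under stated assumptions, not the author's method: the abstract does not say which fields make up the 3-tuple, so (destination IP, destination port, protocol) is assumed, and a fixed `sample_size` stands in for the stopping rule that the dissertation derives from an estimated classification error rate. `FlowSetClassifier` and `toy_classifier` are hypothetical names.

```python
from collections import Counter, defaultdict


class FlowSetClassifier:
    """Classify only the first few flows of each flow set, then vote."""

    def __init__(self, classify_flow, sample_size):
        self.classify_flow = classify_flow  # any per-flow classifier
        self.sample_size = sample_size      # assumed fixed number of flows to inspect
        self.votes = defaultdict(Counter)   # flow-set key -> label vote counts

    def label(self, flow_key, flow_features):
        votes = self.votes[flow_key]
        if sum(votes.values()) < self.sample_size:
            # Still sampling this flow set: run the expensive classifier.
            votes[self.classify_flow(flow_features)] += 1
        # Majority vote; on a tie the label seen first wins.
        return votes.most_common(1)[0][0]


calls = []


def toy_classifier(features):
    # Stand-in for a real per-flow classifier; records how often it runs.
    calls.append(features)
    return "web" if features["mean_pkt_len"] > 500 else "dns"


clf = FlowSetClassifier(toy_classifier, sample_size=3)
key = ("203.0.113.7", 443, "tcp")  # assumed 3-tuple: dst IP, dst port, protocol
lengths = [900, 300, 800, 100, 100]
labels = [clf.label(key, {"mean_pkt_len": n}) for n in lengths]
```

Five flows arrive for the same flow set, but the per-flow classifier runs only three times. The last two flows inherit the voted label without being inspected, which is where the speed-up comes from.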

[Degree-granting institution]: PLA Information Engineering University
[Degree level]: Doctoral
[Year degree awarded]: 2015
[Classification number]: TP393.06

Document ID: 1856809

Link to this document: https://www.wllwen.com/shoufeilunwen/xxkjbs/1856809.html

