Research on Tor Content Classification Based on Traffic Analysis
Published: 2018-02-06 01:51
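Keywords: anonymous communication; Tor-Meek; content classification; data slicing; traffic obfuscation. Source: Beijing Jiaotong University, master's thesis, 2017. Document type: degree thesis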
[Abstract]: In recent years, as network security incidents have become more frequent, network security has been elevated to a matter of national strategy. Anonymous communication technology protects users at two levels: the communicating entities and the communication relationship between them. However, because it hides network behavior, malicious users exploit it for illegal and malicious activity, which poses a serious threat to network security protection and makes network forensics harder. Tor, the most typical anonymous communication application, integrates the pluggable transport Meek to obfuscate its traffic and thereby evade filtering. This thesis identifies and analyzes Tor-Meek traffic and, based on the results of slicing that traffic, uses machine learning to study traffic content classification in both binary and multi-class settings. Experiments show that the proposed traffic-analysis-based Tor-Meek content classification method can effectively classify anonymous communication content, which is of value for network security protection. The thesis studies Tor-Meek traffic content classification from the following four aspects:
(1) First, Tor anonymous communication technology is introduced, covering three topics: the history of anonymous communication, Tor anonymous communication technology, and Tor bridge technology. The thesis focuses on the traffic obfuscation used by Meek and extracts its key technical components: domain fronting, server name indication, and content delivery networks.
(2) A Tor-Meek traffic identification method is proposed that combines static features with dynamic flow features. The process first identifies TLS packets, then applies Meek static features as a second filter, then uses the dynamic polling feature as the decisive check, and finally labels the flows identified as Tor-Meek traffic.
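The three-stage identification in aspect (2) can be sketched as a chain of filters. This is only an illustration under stated assumptions: the abstract does not say which static features or polling statistics the thesis uses, so the `Flow` fields, the `FRONT_DOMAINS` set, and the `max_gap` and `min_requests` thresholds below are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical list of front domains; a real static-feature check would
# rely on whatever features the thesis actually selected.
FRONT_DOMAINS = {"front.example.com"}


@dataclass
class Flow:
    is_tls: bool                # stage 1 result: the flow carries TLS records
    sni: str                    # server name observed in the TLS handshake
    request_times: List[float]  # timestamps (seconds) of client requests


def matches_static_features(flow: Flow) -> bool:
    """Stage 2 (assumed): the server name is a known front domain."""
    return flow.sni in FRONT_DOMAINS


def looks_like_polling(flow: Flow, max_gap: float = 5.0,
                       min_requests: int = 5) -> bool:
    """Stage 3 (assumed): the client keeps sending requests, never pausing
    longer than max_gap seconds between two consecutive requests."""
    times = flow.request_times
    if len(times) < min_requests:
        return False
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    return max(gaps) <= max_gap


def is_tor_meek(flow: Flow) -> bool:
    """Label a flow as Tor-Meek only if it passes all three stages in order."""
    return (flow.is_tls
            and matches_static_features(flow)
            and looks_like_polling(flow))
```

Ordering the checks from cheapest to most specific means most non-Meek flows are rejected before the timing analysis runs.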
(3) Content classification is approached from the perspective of traffic analysis. Based on statistical analysis of the traffic, 19 classification feature parameters are selected. A data slicing model splits the labeled Tor-Meek traffic into slices, and a content classification model then classifies each slice. Libsvm is used as the classification tool, and both multi-class and binary classification are proposed. Finally, classification experiments are designed with the penalty parameter and the slice size as experimental variables and with accuracy, recall, and precision as evaluation metrics, to evaluate the proposed Tor-Meek content classification method.
(4) Finally, the work is summarized and two directions for future research are proposed: improving and optimizing the multi-class experimental method to raise multi-class accuracy, and modeling user behavior to build user behavior profiles.
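The slicing and evaluation steps in aspect (3) can be sketched as follows. The abstract does not define how a slice is measured, so this example assumes a slice is a fixed number of consecutive packets and drops an incomplete tail; `slice_flow` and `binary_metrics` are illustrative names, not the thesis's code. The metric definitions are the standard ones for binary classification.

```python
from typing import List, Sequence, Tuple


def slice_flow(packet_sizes: Sequence[int], slice_size: int) -> List[List[int]]:
    """Split a flow into consecutive slices of slice_size packets.

    Assumption: a trailing slice shorter than slice_size is discarded.
    """
    if slice_size <= 0:
        raise ValueError("slice_size must be positive")
    full_slices = len(packet_sizes) // slice_size
    return [list(packet_sizes[i * slice_size:(i + 1) * slice_size])
            for i in range(full_slices)]


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) over per-slice predictions."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    accuracy = (tp + tn) / len(pairs)
    # Report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall
```

In an experiment of the kind described, each slice would be turned into a feature vector and passed to the classifier, and the metrics would be recomputed for each combination of penalty parameter and slice size.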
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.08
Article ID: 1493291
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1493291.html