Research on Network End-Target Characterization and Identification Based on Flow Behavior Feature Analysis
Published: 2018-11-26 16:36
[Abstract]: With the rapid development of the Internet, how to supervise network traffic and user behavior effectively, and thereby build a civil, healthy, trustworthy, and stable cyberspace, has gradually drawn researchers' attention. As a result, characterizing and identifying the different people on a network (that is, end targets) has become a focus of current research. In recent years, most work has studied how to classify network flows using flow behavior features, while comparatively little has applied these features to characterizing and identifying network end targets. In response to this state of research, this thesis proposes an analysis method based on service-type partitioning. Traffic is first classified by service type, and this classification is applied to the extraction and selection of flow behavior features to obtain a representation of each end target. Machine learning and community detection algorithms are then introduced to complete end-target identification, with good results. The main work is as follows: (1) For identifying individual end targets, that is, determining which end target produced a particular user behavior, the thesis introduces a machine-learning classification method. User traffic is first sorted into the 24 service types defined by the author in order to build the end target's traffic matrix. The raw packets are then processed to obtain the flow behavior features needed for the analysis, and feature selection yields the set of feature parameters that characterizes an end target, so that one day of traffic data becomes one sample describing that end target's behavior. Once enough samples have been collected and manually labeled, the C4.5 decision tree algorithm is trained and tested on them and achieves good identification results. (2) For behavioral similarity between individual end targets, that is, discovering latent community groups in the network, the thesis proposes a community detection algorithm based on flow behavior feature analysis. To measure behavioral similarity between end targets, the author uses Dice similarity for the flow behavior features and cosine similarity for the service types, and builds the corresponding similarity matrices. Finally, the community detection algorithm produces one community partition based on flow behavior features and one based on service types, and the two are combined into the final community partition.
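The similarity step in part (2) can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: Dice similarity is taken in its real-valued vector form, 2(x·y) / (|x|² + |y|²); each end target is reduced to one flow-feature vector and one service-type usage vector; and the sample values, the helper names, and the use of 4 service types instead of the thesis's 24 are purely hypothetical.

```python
import math


def dice_similarity(x, y):
    """Dice coefficient for real-valued vectors: 2*(x.y) / (|x|^2 + |y|^2)."""
    dot = sum(a * b for a, b in zip(x, y))
    denom = sum(a * a for a in x) + sum(b * b for b in y)
    return 2.0 * dot / denom if denom else 0.0


def cosine_similarity(x, y):
    """Cosine of the angle between x and y; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors, sim):
    """Pairwise similarity matrix over a list of equal-length vectors."""
    n = len(vectors)
    return [[sim(vectors[i], vectors[j]) for j in range(n)] for i in range(n)]


# Hypothetical data for three end targets (not taken from the thesis).
# Each row of flow_features is an already-normalized flow behavior feature vector.
flow_features = [
    [0.2, 0.8, 0.5],
    [0.2, 0.8, 0.5],
    [0.9, 0.1, 0.0],
]
# Each row of service_usage counts flows per service type (4 types here for brevity).
service_usage = [
    [10, 0, 5, 1],
    [20, 0, 10, 2],
    [0, 7, 0, 0],
]

dice_matrix = similarity_matrix(flow_features, dice_similarity)
cosine_matrix = similarity_matrix(service_usage, cosine_similarity)
```

Either matrix could then be passed to a community detection algorithm as a weighted adjacency matrix. How the thesis combines the two resulting partitions is not stated in the abstract, so that step is left out here.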
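[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP393.06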
Article ID: 2359073
Article link: https://www.wllwen.com/guanlilunwen/ydhl/2359073.html