Research on Clustering Analysis of Malicious Code
Published: 2018-12-10 10:19
[Abstract]: With the rapid development of the Internet, the volume of malicious code continues to grow, and malicious code analysis remains a central problem in information security. Researchers have accordingly studied malicious code detection, clustering, classification, and homology analysis. Building on the current state of malicious code analysis, this thesis makes three contributions. (1) To address the disconnect between industry's automated analysis systems and academic work on clustering, classification, and homology analysis, the thesis proposes a new theoretical model for automated malicious code analysis. Classification, clustering, homology, and evolution techniques developed on top of this model can be applied more readily in antivirus vendors' products, so the model unifies academic and industrial work. (2) Because different studies use different data as their objects of analysis, their results are hard to compare. The thesis therefore proposes a description specification for malicious code, selects samples from malicious code families, and provides an open dataset. Studies that use this dataset can be compared with one another, and the dataset can serve as a basis for more accurate evaluation criteria. (3) Earlier work found that malicious code families differ in how loosely their samples are distributed. To address this, the thesis designs and implements a malicious code clustering algorithm based on SNN density. The algorithm is insensitive to sample density and is therefore well suited to clustering malicious code. The implementation uses opcodes and system calls as input features and measures the accuracy of SNN density clustering for each feature type; the highest accuracy reaches 100%.
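The abstract does not describe the implementation, so the following is only a minimal sketch of one common formulation of SNN (shared nearest neighbour) density clustering, not the thesis's own code. It assumes opcode bigram sets as features and Jaccard distance between them. The function names, the parameters `k`, `eps`, and `min_pts`, and the toy opcode sequences are all hypothetical. Because similarity is counted in shared neighbours rather than raw distance, a tight family and a loose family can both form clusters under the same thresholds.

```python
from itertools import combinations


def opcode_ngrams(opcodes, n=2):
    """Return the set of opcode n-grams in a sequence (assumed feature choice)."""
    return {tuple(opcodes[i:i + n]) for i in range(len(opcodes) - n + 1)}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def snn_cluster(features, k, eps, min_pts):
    """Cluster items by SNN density. Returns one label per item; -1 is noise."""
    n = len(features)
    # Step 1: k-nearest-neighbour set of every item (ties broken by index).
    knn = []
    for i in range(n):
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (jaccard_distance(features[i], features[j]), j),
        )
        knn.append(set(others[:k]))

    # Step 2: SNN similarity = number of shared neighbours,
    # counted only when the two items are in each other's k-NN sets.
    sim = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if j in knn[i] and i in knn[j]:
            sim[i][j] = sim[j][i] = len(knn[i] & knn[j])

    # Step 3: SNN density = number of items with similarity >= eps.
    density = [sum(1 for j in range(n) if j != i and sim[i][j] >= eps)
               for i in range(n)]
    core = [i for i in range(n) if density[i] >= min_pts]

    # Step 4: core points linked by similarity >= eps share a cluster.
    labels = [-1] * n
    next_label = 0
    for c in core:
        if labels[c] != -1:
            continue
        labels[c] = next_label
        stack = [c]
        while stack:
            p = stack.pop()
            for q in core:
                if labels[q] == -1 and sim[p][q] >= eps:
                    labels[q] = next_label
                    stack.append(q)
        next_label += 1

    # Step 5: attach each non-core item to its most similar core point
    # if that similarity reaches eps; otherwise it stays noise (-1).
    core_set = set(core)
    for i in range(n):
        if i in core_set or not core:
            continue
        best = max(core, key=lambda c: sim[i][c])
        if sim[i][best] >= eps:
            labels[i] = labels[best]
    return labels


# Toy data: two made-up "families" of three samples each, plus one outlier.
samples = [
    ["push", "mov", "call", "pop", "ret"],
    ["push", "mov", "call", "pop", "ret", "nop"],
    ["push", "mov", "call", "pop", "pop", "ret"],
    ["xor", "add", "jmp", "cmp", "jne"],
    ["xor", "add", "jmp", "cmp", "jne", "nop"],
    ["xor", "xor", "add", "jmp", "cmp", "jne"],
    ["int", "hlt", "cli"],
]
features = [opcode_ngrams(s) for s in samples]
labels = snn_cluster(features, k=2, eps=1, min_pts=2)
print(labels)  # [0, 0, 0, 1, 1, 1, -1]
```

System-call traces could be fed through the same functions by passing call names instead of opcodes. The pairwise distance loop is quadratic in the number of samples, so a real corpus would need an indexed nearest-neighbour search.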
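[Degree-granting institution]: National University of Defense Technology
[Degree level]: Master's
[Year conferred]: 2016
[Classification number]: TP309
Article ID: 2370412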
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2370412.html