Similarity-Based Malware Feature Extraction and Detection on the Android Platform
Published: 2018-10-24 18:57
[Abstract]: With the arrival of the Internet era, smartphone adoption worldwide keeps rising, and the Android smartphone operating system has captured a large market share thanks to its strong performance. As smartphones have developed, however, more and more mobile malware has appeared on the market, endangering users' information security. Major security laboratories have gradually made mobile security a research priority, but effectively detecting and removing new malware and variants of existing malware remains a difficult problem. Because traditional malware signature extraction works on a program's binary text, it cannot detect new or mutated malware; this thesis therefore proposes a similarity-based feature extraction method for Android malware. The method uses the Google distance to compute the similarity between pieces of information specific to the source code, such as API calls, Android permissions, and commonly used parameters, then mines the keywords commonly used in Android software and groups them by similarity. These keywords are then compared experimentally against those found in benign software to obtain the features of Android malware. A support vector machine (SVM) is then trained on the feature set, so that the method can keep incorporating new malware samples. At detection time, the system extracts the source code of the target application and compares its set of sensitive keywords with the sample sets already in the system library, which allows it to detect both new malware and variants of older malware. Compared with traditional signature extraction, the novelty of this work is that it breaks with the conventional approach of recording malware features from binary context and instead builds the feature library from the malware's whole operating environment, recording the malware's behavior as its features. It also introduces a relatively advanced machine learning method to train on and classify the feature sets. Experiments show that the method is effective.
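As an illustration of the similarity step described in the abstract, the following minimal sketch computes a Normalized Google Distance (NGD) between keywords extracted from a set of apps. The abstract does not give the exact formulation used in the thesis, so this assumes the standard definition NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), with f(x) taken to be the number of apps whose keyword set contains x, f(x, y) the number containing both, and N the total number of apps. The function names `ngd` and `pairwise_ngd` and the toy corpus are hypothetical.

```python
import math
from itertools import combinations


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from occurrence counts.

    fx, fy : number of apps containing keyword x / keyword y
    fxy    : number of apps containing both keywords
    n      : total number of apps in the corpus
    """
    if fx == 0 or fy == 0 or fxy == 0:
        # Keywords that never co-occur are treated as infinitely distant.
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(lx, ly)
    if denom == 0:
        # Both keywords occur in every app, so they are indistinguishable.
        return 0.0
    return (max(lx, ly) - lxy) / denom


def pairwise_ngd(apps):
    """Return {(x, y): NGD} for every keyword pair, with x < y in sorted order.

    apps is a list of keyword sets, one set per app (assumed input format).
    """
    n = len(apps)
    vocab = sorted(set().union(*apps))
    count = {k: sum(1 for a in apps if k in a) for k in vocab}
    dist = {}
    for x, y in combinations(vocab, 2):
        both = sum(1 for a in apps if x in a and y in a)
        dist[(x, y)] = ngd(count[x], count[y], both, n)
    return dist


# Hypothetical toy corpus: each set holds permissions and API names
# extracted from one app. These are not data from the thesis.
TOY_APPS = [
    {"SEND_SMS", "sendTextMessage", "READ_CONTACTS"},
    {"SEND_SMS", "sendTextMessage"},
    {"INTERNET", "openConnection"},
    {"INTERNET", "openConnection", "READ_CONTACTS"},
]

if __name__ == "__main__":
    for pair, d in sorted(pairwise_ngd(TOY_APPS).items(), key=lambda kv: kv[1]):
        print(pair, d)
```

Keywords that always appear together (such as a permission and the API call it guards) get a distance near 0 and can be grouped into one feature, while pairs that never co-occur get an infinite distance. The resulting groups could then be turned into feature vectors for an SVM classifier, which is the step the abstract describes next.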
[Degree-granting institution]: Hangzhou Normal University
[Degree level]: Master's
[Year degree conferred]: 2016
[Classification number]: TP316; TP309
Article ID: 2292263
Article link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/2292263.html