Research on Steganography Software Identification Based on Code Segmentation

Published: 2018-09-10 08:34

【Abstract】: Steganography forensics has become one of the most serious challenges facing computer forensics. If an examiner can identify the steganography software present on a forensic target, many of the difficulties of steganography forensics can be sidestepped, and the identification provides clues and evidence for discovering the target and extracting the secret information. After surveying current research on steganography software identification and the related software-feature techniques, this thesis examines steganography software identification, a key step in steganography forensics, in some depth. The specific contributions are as follows. 1. From the perspective of the forensic workflow, the existing steganography forensics procedure is extended to include steganography software identification, which enriches the means available to steganography forensics and broadens its scope. 2. Based on the idea of code segmentation, three code segmentation methods are proposed and a framework for steganography software identification based on code segmentation is given. By splitting a complex program (or software package) into a number of code fragments that are easier to understand, the framework reduces the complexity of program comprehension. 3. An identification algorithm based on "instruction words" is proposed. Using the k-gram principle, the algorithm approximately segments a program's instruction opcode sequence into easily understood code fragments with independent functionality, which serve as "instruction words". It then collects the instruction words that occur most frequently in the steganography software, builds a software feature vector over them, quantifies each component by the occurrence count of the corresponding instruction word, and uses the cosine of the angle between feature vectors to measure how closely a candidate program matches the target steganography software. Experimental results show that the algorithm distinguishes the target steganography software from other software, identifies steganography software that has undergone code obfuscation fairly reliably, identifies "variants" produced by upgrading or bundling, and distinguishes different steganography programs from one another. 4. An identification algorithm based on register dependence is proposed. Using the register dependence relations between instructions, the algorithm splits the program's register dependence graph into paths and takes the instruction sequence on each path as a code module. It measures the match between a module of the candidate program and a module of the target steganography software by the relative length of their longest common subsequence, and measures the match of the candidate program as a whole against the target by means of bipartite graph matching. Experimental results show that the algorithm identifies variants of steganography software effectively. Finally, the thesis summarizes the work and outlines directions for future research.
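The instruction-word algorithm in item 3 of the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the abstract gives no parameters, so the sliding-window reading of "k-gram", the function names, and the defaults `k=3` and `top_n=50` are all assumptions made for illustration. Opcodes are assumed to be already disassembled into a list of mnemonic strings.

```python
from collections import Counter
from math import sqrt


def instruction_words(opcodes, k):
    """Slide a window of length k over the opcode sequence.

    Each window (a k-gram) is treated as one "instruction word".
    Using overlapping windows is an assumption of this sketch.
    """
    return [tuple(opcodes[i:i + k]) for i in range(len(opcodes) - k + 1)]


def build_vocabulary(target_opcodes, k, top_n):
    """Keep the top_n most frequent instruction words of the target program."""
    counts = Counter(instruction_words(target_opcodes, k))
    return [word for word, _ in counts.most_common(top_n)]


def feature_vector(opcodes, k, vocabulary):
    """Quantify each vocabulary word by its occurrence count in the program."""
    counts = Counter(instruction_words(opcodes, k))
    return [counts[word] for word in vocabulary]


def cosine(u, v):
    """Cosine of the angle between two count vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_score(target_opcodes, candidate_opcodes, k=3, top_n=50):
    """Match degree of a candidate program relative to the target program."""
    vocabulary = build_vocabulary(target_opcodes, k, top_n)
    target_vec = feature_vector(target_opcodes, k, vocabulary)
    candidate_vec = feature_vector(candidate_opcodes, k, vocabulary)
    return cosine(target_vec, candidate_vec)
```

Because the vocabulary is drawn from the target program only, a candidate that shares none of the target's frequent instruction words gets a score of 0, and the score approaches 1 as the candidate's instruction-word counts become proportional to the target's. A decision threshold on the score would have to be chosen experimentally; the abstract does not state one.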
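The matching stage of the register-dependence algorithm in item 4 of the abstract can be sketched in a similar way. The sketch below assumes the code modules (instruction sequences along paths of the register dependence graph) have already been extracted, and covers only the comparison. Several details are assumptions rather than the thesis's definitions: the longest-common-subsequence length is normalised by the target module's length, two modules are linked when their similarity reaches a hypothetical `threshold`, and the program-level score is the size of a maximum bipartite matching divided by the number of target modules.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def module_similarity(target_module, candidate_module):
    """LCS length relative to the target module's length (assumed normalisation)."""
    if not target_module:
        return 0.0
    return lcs_length(target_module, candidate_module) / len(target_module)


def max_bipartite_matching(edges, n_right):
    """Size of a maximum matching, found with augmenting paths.

    edges[i] lists the right-hand vertices adjacent to left-hand vertex i.
    """
    match_of_right = [-1] * n_right

    def try_augment(left, seen):
        for right in edges[left]:
            if right in seen:
                continue
            seen.add(right)
            # Take a free vertex, or re-route the vertex currently holding it.
            if match_of_right[right] == -1 or try_augment(match_of_right[right], seen):
                match_of_right[right] = left
                return True
        return False

    return sum(try_augment(left, set()) for left in range(len(edges)))


def program_match(target_modules, candidate_modules, threshold=0.8):
    """Fraction of target modules that can be paired one-to-one with candidate modules."""
    if not target_modules:
        return 0.0
    edges = [
        [j for j, cand in enumerate(candidate_modules)
         if module_similarity(tgt, cand) >= threshold]
        for tgt in target_modules
    ]
    return max_bipartite_matching(edges, len(candidate_modules)) / len(target_modules)
```

The one-to-one pairing is what the bipartite matching contributes: a single candidate module cannot be counted as the match for several target modules, so a program that merely repeats one common fragment does not score highly. A weighted assignment over the similarity values would be an equally plausible reading of the abstract.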
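【Degree-granting institution】: PLA Information Engineering University (解放军信息工程大学)
【Degree level】: Master's
【Year degree conferred】: 2011
【Classification number】: TP311.5;D918.2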

【References】