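Research on Reverse Extraction of Protocol Models Based on Dynamic Binary Analysis and Its Applications
Published: 2018-04-24 01:25
Topics: protocol reverse engineering + dynamic binary analysis; Source: National University of Defense Technology, 2014 doctoral dissertation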
[Abstract]: With the rapid growth of the Internet, network-based applications have spread into every area of human society. Network protocols, and cryptographic protocols in particular, are the basic technical underpinning of computer networks, so their availability, reliability, and security are especially important. As a result, automatic protocol reverse engineering has become an active research topic in recent years. The protocol model is a key target of protocol reverse engineering: it gives an abstract description of an application's dynamic network behavior and is valuable for protocol security analysis, verification of protocol implementations, and protocol fingerprinting. This dissertation takes reverse extraction of the protocol model as its central goal. It addresses several hard problems that arise in reverse-analysis practice: protocol message fields and their semantics are difficult to infer accurately, encrypted network traffic is difficult to parse, protocol temporal logic and state transitions are difficult to infer, and the code of complex network applications is difficult to analyze. To address these problems, it proposes a set of protocol model reverse-extraction methods built on dynamic binary analysis of programs. The main question studied is how to recover protocol message formats, protocol models, and protocol specifications from the dynamic execution of a network application. Building on this, the dissertation studies a model-guided method for mining protocol deviations and proposes a method for automatically extracting and recognizing program fingerprints based on protocol deviations. The main contributions and innovations are as follows.

(1) A broad and in-depth survey of the state of the art and recent progress in protocol reverse engineering and dynamic binary analysis. With protocol verification, program network behavior analysis, and protocol vulnerability discovery in view, the survey introduces protocol reverse engineering techniques at two levels, network traffic analysis and host-based analysis. It classifies existing schemes and mechanisms and summarizes the strengths, weaknesses, and scope of each method, thereby clarifying the main work of the dissertation. It also examines the theory behind dynamic binary analysis, the key enabling technology for this work, describes the principles of key techniques such as taint propagation analysis and dynamic binary instrumentation (DBI), and summarizes the strengths and weaknesses of the various dynamic binary analysis platforms.

(2) A set of methods for reverse parsing of message formats based on dynamic binary analysis. Analyzing encrypted network traffic, identifying protocol message fields, and inferring field semantics have long been technical challenges for protocol reverse engineering. The root causes lie in the reverse-analysis methods themselves and in inherent factors such as the difficulty of recovering protocol information. Drawing on semantic knowledge of host-side encryption and decryption behavior, the dissertation proposes a method for inferring the semantic attributes of message fields at the function level and the instruction level, together with a hybrid taint analysis technique that combines library-function-call-level and instruction-level analysis. The hybrid technique addresses the low precision and narrow applicability of library-call-level taint analysis and the difficulty of obtaining semantics in instruction-level taint analysis. On this basis, the dissertation proposes a method that can reverse-parse the format of encrypted messages in cryptographic protocols, which solves the problem that current network-traffic-based protocol reverse analysis cannot handle encrypted messages.

(3) A technique for reverse inference of distributed multi-role protocol models based on mining message interaction graphs of protocol network behavior. The protocol model gives an abstract description of a network application's dynamic network behavior. Modern network protocols, especially security protocols built on cryptographic mechanisms, often have complex temporal logic and state transitions, so recovering the protocol model from a network application is difficult. Applying state machine theory and methods, the dissertation proposes an inference technique that can reverse-extract the protocol model of a cryptographic protocol application even when multiple role principals take part in the session. On this basis it proposes an algorithm for converting a protocol state machine model into a formal protocol specification. Following the definitions of a high-level protocol description language, the algorithm automatically describes the extracted protocol model as a formal protocol specification.

(4) A model-guided method for automatically mining protocol deviations. A protocol deviation describes the differences in actual network behavior among the implementations of different versions of a protocol. Given the value of protocol deviations for verifying protocol implementations and extracting protocol fingerprints, the dissertation proposes a method for automatically mining implementation deviations under the guidance of the protocol model. The method runs a series of active, iterative tests against the protocol implementation under test to uncover deviations among the implementation versions. In the process it continually refines the reverse-inferred protocol model, which improves the accuracy of the reverse analysis.

(5) A method for automatically extracting and recognizing program protocol fingerprints based on protocol deviations. Traditional protocol fingerprint extraction costs a great deal of time and manual effort. Exploiting the characteristics of protocol deviations, the dissertation proposes, for the first time, a method for automatic extraction and recognition of program protocol fingerprints. Its key idea is to extract protocol features by observing the dynamic execution of message processing in the network application, so it can also be used to fingerprint programs that communicate using cryptographic protocols. The method takes the session-flow level and the response-message level of protocol deviations as its entry points. For automatic fingerprint extraction, the dissertation first proposes a protocol feature extraction method that combines a TPFSM description of session flow features with the characteristics of deviation response messages, and then studies the construction and optimization of the protocol fingerprint library. For automatic fingerprint recognition, it first introduces session flow encoding and the concept of a SHINGLE (a sample of a contiguous node sequence), and then proposes a SHINGLE-based session flow feature matching algorithm at the session-flow level and a regular-expression-based message feature matching method.

This research is a useful exercise and exploration in protocol reverse engineering. Its results have theoretical value and practical significance for future work on verification of protocol implementations, program network behavior analysis, and protocol vulnerability discovery, and they contribute to the development of the network security field.
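
The SHINGLE idea mentioned under contribution (5) can be illustrated with a minimal sketch. The abstract only defines a SHINGLE as a sample of a contiguous node sequence and does not give the matching algorithm, so everything below is an assumption made for illustration: the function names (`shingles`, `jaccard`, `best_match`), the window size `k = 3`, the use of Jaccard similarity as the score, and the example message labels. It is not the dissertation's algorithm.

```python
from typing import Dict, Sequence, Set, Tuple

Shingle = Tuple[str, ...]


def shingles(flow: Sequence[str], k: int = 3) -> Set[Shingle]:
    """Return every contiguous window of k nodes in an encoded session flow."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A flow shorter than k yields an empty set.
    return {tuple(flow[i:i + k]) for i in range(len(flow) - k + 1)}


def jaccard(a: Set[Shingle], b: Set[Shingle]) -> float:
    """Jaccard similarity of two shingle sets (assumed scoring rule)."""
    union = a | b
    if not union:
        return 0.0  # no shingles on either side: treat as no evidence
    return len(a & b) / len(union)


def best_match(
    observed: Sequence[str],
    fingerprints: Dict[str, Sequence[str]],
    k: int = 3,
) -> Tuple[str, Dict[str, float]]:
    """Score an observed flow against each stored flow; return the top name and all scores."""
    obs = shingles(observed, k)
    scores = {
        name: jaccard(obs, shingles(flow, k))
        for name, flow in fingerprints.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Hypothetical fingerprint library: one encoded session flow per implementation.
    library = {
        "impl_a": ["HELLO", "AUTH", "OK", "DATA", "BYE"],
        "impl_b": ["HELLO", "AUTH", "RETRY", "AUTH", "OK"],
    }
    name, scores = best_match(["HELLO", "AUTH", "OK", "DATA"], library)
    print(name, scores)
```

Comparing sets of short windows rather than whole sequences lets a partial or truncated session still match the implementation whose flow it overlaps most, which is one plausible reason to use shingles for fingerprint recognition.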
[Degree-granting institution]: National University of Defense Technology
[Degree level]: Doctoral
[Year conferred]: 2014
[Classification number]: TP393.04
Article ID: 1794523
Article link: https://www.wllwen.com/guanlilunwen/ydhl/1794523.html