Research on Hardware Implementation and Optimization Techniques for Deep Learning

Published: 2018-06-06 03:21

Topics: deep learning + neural networks; Source: Harbin Institute of Technology, 2017 master's thesis

[Abstract]: In recent years, with the rise of artificial intelligence, new intelligent algorithms represented by deep learning have been applied successfully in engineering fields such as machine vision, image processing, and pattern recognition. Under the pressure of industrial big data, however, traditional software implementations cannot meet the practical engineering requirements of low cost, high timeliness, and high fault tolerance, so new solutions are urgently needed. The field-programmable gate array (FPGA) is a widely used hardware development platform that offers large-scale distributed hardware resources, a short development cycle, low power consumption, and good performance, which makes it well suited to implementing computationally intensive deep learning algorithms. This thesis uses the FPGA as its hardware development platform to study the hardware implementation and optimization of deep learning. The main work is as follows. First, the overall scheme for the hardware implementation of deep learning is designed. The theoretical foundations of deep learning are analyzed in detail; taking the convolutional neural network as an example, the topology and functional characteristics of the network are studied, and the specific network topology implemented in hardware in this thesis is given. Based on the structural characteristics of that topology, the overall system scheme is designed and the network topology is mapped onto concrete hardware circuits. Second, the optimization techniques and architecture for porting the algorithm to hardware are developed. The FPGA is chosen as the target platform. In line with the goal of a low-power, high-efficiency deep learning implementation, the optimization techniques for hardware porting are studied in depth and then applied to design a parallel architecture for the convolutional neural network, from coarse-grained to fine-grained parallelism. Third, the FPGA-based convolutional neural network is designed and implemented. With the FPGA as the hardware development platform, the overall architecture of the convolutional neural network is designed. Based on the structural characteristics of the network, the functional circuit modules are designed, including the convolution module, the subsampling module, and the activation function module. A ping-pong buffer structure is designed to optimize the data transfer structure and the data buffering units. The functional correctness of each module is verified with the simulation tool Modelsim. Finally, the complete experimental platform is built. Within the available experimental conditions, the network structure and parameters are configured, a heterogeneous "FPGA+CPU" system is designed, and the convolutional neural network is implemented in hardware. With handwritten digit recognition as the target application, comparative experiments between the software and hardware implementations are carried out. Statistics from a large number of experiments show that the FPGA-based convolutional neural network designed in this thesis is functionally complete and performs well.
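The abstract mentions a ping-pong buffer but gives no design details. As a rough illustration of the general idea only, the following is a minimal software sketch: two buffers alternate roles so that one is being loaded with the next data tile while the other, loaded in the previous step, is being consumed. The function name `ping_pong_process`, the tile format, and the `compute` callback are all hypothetical and are not taken from the thesis; in an FPGA the load and compute phases would run concurrently, whereas here they simply run back to back.

```python
from typing import Callable, Iterable, List, Optional, Sequence


def ping_pong_process(
    tiles: Iterable[Sequence[int]],
    compute: Callable[[Sequence[int]], int],
) -> List[int]:
    """Software model of ping-pong (double) buffering; names are hypothetical."""
    buffers: List[Optional[List[int]]] = [None, None]
    write_sel = 0  # index of the buffer currently being filled
    results: List[int] = []

    for tile in tiles:
        # Load phase: fill the buffer selected for writing.
        buffers[write_sel] = list(tile)

        # Compute phase: consume the other buffer, filled in the previous step.
        read_sel = 1 - write_sel
        pending = buffers[read_sel]
        if pending is not None:
            results.append(compute(pending))

        # Swap roles for the next tile.
        write_sel = read_sel

    # Drain: the most recently loaded tile has not been consumed yet.
    last = buffers[1 - write_sel]
    if last is not None:
        results.append(compute(last))
    return results


if __name__ == "__main__":
    # Each tile stands in for a block of input data; sum() stands in for
    # the convolution or subsampling computation.
    print(ping_pong_process([[1, 2], [3, 4], [5, 6]], sum))
```

The point of the alternation is that the compute unit never has to wait for a full load to finish before it has data to work on, at the cost of doubling the buffer storage.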
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP18
