Research on Hardware Acceleration Mechanisms for On-Site Image Enhancement

Published: 2018-01-23 12:02

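Keywords: on-site image enhancement; FPGA hardware acceleration; high dynamic range video; video motion magnification; high-speed serial bus interface. Source: University of Science and Technology of China, 2017 doctoral dissertation. Document type: degree thesis.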

[Abstract]: As processor performance keeps improving, more and more of the information in images and video is presented to users in an intuitive, visual form. Meeting the growing demand for such intuitive imaging, however, means handling very large volumes of data in real time. Approaches that try to improve the performance of a conventional image processing platform usually fall short: neither real-time operation nor system bandwidth can be guaranteed. Such platforms typically capture images with a single-camera, single-sensor module and use a homogeneous processor architecture that relies on a host for post-processing. Because they depend mainly on software algorithms to respond to application requirements, and software running on the host is slow, they cannot meet the real-time requirements of applications such as stereo vision, virtual reality, and augmented reality. In addition, capturing scenes at high resolution and high frame rate requires a high-bandwidth front-end bus to carry the image data, and improvements to a conventional platform cannot supply that bandwidth. The post-processing approach of conventional image processing platforms therefore cannot satisfy the real-time and bandwidth demands of intuitive imaging.
To address these application requirements, this thesis proposes an on-site image enhancement method. The method targets the real-time and bandwidth requirements of intuitive imaging within the field of image enhancement: it restores and enhances the image information and delivers the enhanced result in real time. Algorithm optimization and hardware acceleration on an FPGA remove the real-time bottleneck, and a customized on-site high-speed bus interface, UPI (a multiplexed PCIe/SRIO interface), removes the bandwidth bottleneck.
The method has two parts. First, the scene information is restored so that the captured range of information is accurate. The thesis uses high dynamic range (HDR) imaging for this step, surveys single-camera, multi-camera, and single-lens multi-sensor imaging methods, and proposes an HDR video algorithm that can run in real time. Second, the scene information is enhanced. The thesis reviews several image enhancement methods and their respective strengths and weaknesses, simplifies the construction of the Laplacian pyramid used in Eulerian video magnification, implements pipelined IIR filtering, and proposes an Eulerian video magnification algorithm suited to fast hardware implementation. The thesis studies several key problems of on-site image enhancement and presents a systematic hardware acceleration scheme.
The main work and contributions are as follows. (1) The thesis summarizes current research on HDR image and video algorithms and proposes a real-time HDR video algorithm together with its hardware acceleration. For HDR imaging, it proposes an improved way of choosing the Ward weighting function and uses a third-order Bézier function to derive a fitting formula for the camera response curve, so that the irradiance map can be recovered without precise knowledge of the exposure times.
The thesis also proposes an optimized global tone-mapping operator that lowers the luminance of highlight regions without affecting contrast, so that the image shows no saturation distortion. For the flicker problem in HDR video, it proposes a hardware solution: a leaky-integrator model processes the brightness parameters computed independently for each frame, so that the parameters used during tone mapping stay relatively consistent from frame to frame. To deal with the storage-intensive part of the hardware implementation, the camera response curve is compressed with quadtree coding, which saves at least 99.6% of BRAM resources compared with storing the curve directly. To deal with the compute-intensive part, polynomial approximation reduces the costly exponential and logarithm operations to shifts and additions, ping-pong buffers support multi-channel parallel pipelining, and the FPGA's embedded DSP slices speed up the loop computations of the software algorithm. Compared with the FPGA hardware platforms proposed by Lapray, Mann, and others, the proposed design needs less time to process video of the same resolution: with a 120 MHz system clock, it outputs one frame of 1920×1080 standard video data (19.58 MB) within 15.3 ms.
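The reduction of logarithms to shifts and additions can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract does not give the polynomial used, so the sketch substitutes Mitchell's classic first-order approximation, log2(x) ≈ k + (x / 2^k - 1), where k is the position of the leading one bit. The function name `log2_q16` and the Q16 fixed-point output format are assumptions made for illustration.

```python
FRAC_BITS = 16  # assumed Q16 fixed-point output format (hypothetical choice)


def log2_q16(x: int) -> int:
    """Approximate log2(x) for a positive integer x, returned as Q16 fixed point.

    Uses Mitchell's first-order approximation: only a leading-one search,
    a subtraction, shifts, and an addition; no multiplier or lookup table.
    """
    if x <= 0:
        raise ValueError("x must be a positive integer")
    k = x.bit_length() - 1            # index of the leading one = integer part
    remainder = x - (1 << k)          # bits below the leading one
    # Scale remainder / 2^k into a Q16 fraction using shifts only.
    if k >= FRAC_BITS:
        frac = remainder >> (k - FRAC_BITS)
    else:
        frac = remainder << (FRAC_BITS - k)
    return (k << FRAC_BITS) + frac


if __name__ == "__main__":
    for value in (1, 3, 8, 12, 1000):
        print(value, log2_q16(value) / (1 << FRAC_BITS))
```

The leading-one position supplies the integer part and the remaining bits, shifted into place, supply the fraction. The error of this first-order form stays below about 0.09 in log2 units; a higher-degree polynomial applied to the fraction would reduce it further at the cost of extra adders or multipliers.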
(2) The thesis reviews current research on motion magnification algorithms, analyzes the strengths and weaknesses of Lagrangian and Eulerian video magnification, and explains why the Lagrangian approach is not suited to hardware implementation. It proposes an Eulerian video magnification algorithm for fast hardware implementation that reduces the number of pyramids and fixes the amplification factor. Without degrading the visual result, it achieves a 16.1× hardware speedup over a Matlab software implementation running on an Intel(R) Xeon(R) processor (3.3 GHz).
(3) The thesis summarizes methods for customizing the front-end bus and for building real-time image processing platforms, and proposes a hardware acceleration method built around multiple message queues and an image enhancement engine. Two FPGAs carry out the on-site image enhancement task, and the FPGA tasks are scheduled using function-level task partitioning. The front-end bus is PCIe, and the FPGAs are interconnected over SRIO. The thesis designs a flexible FPGA high-speed serial bus interface, UPI (Unified PHY Interface), and provides the corresponding API functions. The interface uses a shared-PHY physical-layer architecture in which a single high-speed serial transceiver carries PCIe and SRIO protocol packets by time division to capture front-end frame data, which makes the platform more flexible and helps it meet its bandwidth requirements.
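As a rough software illustration of Eulerian magnification with IIR temporal filtering and a fixed amplification factor, the Python sketch below processes the time series of a single pixel (or a single pyramid coefficient). It is a sketch under assumptions, not the thesis's hardware design: the band-pass is built as the difference of two first-order IIR low-pass filters, and the name `magnify_series`, the factor `alpha = 8`, and the power-of-two coefficients `r_fast` and `r_slow` are hypothetical values chosen so that the multiplications could map to shifts.

```python
def magnify_series(series, alpha=8.0, r_fast=0.5, r_slow=0.125):
    """Amplify small temporal variations in one pixel's value over time.

    series: per-frame values of one pixel or pyramid coefficient.
    alpha:  fixed amplification factor (assumed value).
    r_fast, r_slow: smoothing coefficients of the two first-order IIR
                    low-pass filters (assumed values, r_fast > r_slow).
    """
    if not series:
        return []
    low_fast = low_slow = series[0]   # start both filters at the first sample
    out = []
    for x in series:
        # Each update needs only the previous state, so frames can be streamed.
        low_fast = (1 - r_fast) * low_fast + r_fast * x
        low_slow = (1 - r_slow) * low_slow + r_slow * x
        band = low_fast - low_slow    # temporal band-pass component
        out.append(x + alpha * band)  # add the amplified variation back
    return out


if __name__ == "__main__":
    print(magnify_series([0.0, 0.0, 1.0, 0.0, 0.0]))
```

A constant input passes through unchanged because both filters settle to the same value, while a change between frames appears in `band` and is amplified. Keeping only two state values per coefficient is what makes this form of temporal filter attractive for a streaming pipeline.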

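[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Doctoral
[Year conferred]: 2017
[Classification number]: TP391.41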
