Optimization and Verification of a JPEG2000 Codec
Published: 2018-06-13 16:32
Topics: JPEG2000 + DWT algorithm; Source: master's thesis, Harbin Institute of Technology, 2016
[Abstract]: The JPEG2000 still-image compression standard adopts a DWT (Discrete Wavelet Transform) algorithm based on the linear lifting scheme and the EBCOT (Embedded Block Coding with Optimized Truncation) algorithm, which give it better performance than the JPEG standard. The DWT is computation-intensive and EBCOT codes iteratively with the single bit as its coding unit, so both strongly call for hardware acceleration and optimization. The encoding and decoding sides of JPEG2000 share similar optimization strategies. For the DWT or IDWT, this thesis starts from modified forms of two one-dimensional wavelet transform kernels and maps the resulting data-flow graph onto a pipelined hardware circuit with high reuse, a short critical path, and low control complexity. In EBCOT or EBDOT, the bit-plane scanning algorithm handles the three coding passes in a unified way to save hardware resources and takes the stripe column as its processing unit, pipelining stripe-column processing with generation of the context state window, which greatly reduces the number of processing cycles. For the MQ algorithm, logic restructuring simplifies the critical processing path, and the number of left shifts in renormalization is predicted in advance at the cost of very little storage, which greatly reduces clock cycles. For hardware encoding and decoding of Tier-2 tag trees, row and column parity flags together with a line buffer of parent-node-coded flags are enough to support images of arbitrary resolution. Encoding must additionally handle rate control: a traversal comparison yields the minimum rate-distortion value over the last passes of the already coded code blocks and the pass currently being coded, and on that basis coding of the current code block can be skipped. This greatly reduces the number of traversals and the amount of redundant coding, and the image distortion is slightly lower than that of the standard algorithm.
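As a software illustration of the lifting-based DWT mentioned in the abstract, the following minimal sketch implements the reversible 5/3 lifting steps (predict, then update) defined in JPEG2000 Part 1, with whole-sample symmetric boundary extension. The abstract does not say which two kernels the thesis modifies or how they are reshaped for hardware, so the choice of the 5/3 kernel and the function names `fwd53`, `inv53`, and `_mirror` are assumptions made for illustration only; this is not the thesis's pipelined circuit.

```python
def _mirror(i, n):
    """Whole-sample symmetric extension of index i into the range [0, n)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def fwd53(x):
    """Forward reversible 5/3 lifting on a 1-D integer sequence (len >= 2).

    Returns (low, high): the approximation and detail coefficients.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two samples")
    y = list(x)
    # Predict step: each odd sample becomes a detail coefficient.
    for i in range(1, n, 2):
        y[i] -= (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)]) >> 1
    # Update step: each even sample becomes an approximation coefficient.
    for i in range(0, n, 2):
        y[i] += (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) >> 2
    return y[0::2], y[1::2]


def inv53(low, high):
    """Inverse of fwd53: undo the update step, then undo the predict step."""
    n = len(low) + len(high)
    y = [0] * n
    y[0::2] = low
    y[1::2] = high
    for i in range(0, n, 2):
        y[i] -= (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) >> 2
    for i in range(1, n, 2):
        y[i] += (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)]) >> 1
    return y


if __name__ == "__main__":
    samples = [10, 12, 14, 200, 7, 7, 7, 3, 0]
    low, high = fwd53(samples)
    # Integer lifting is exactly invertible, so the round trip is lossless.
    print(inv53(low, high) == samples)
```

Because every lifting step only adds or subtracts a rounded function of the other polyphase branch, the inverse simply replays the same steps in reverse order with opposite signs, which is also what lets an encoder and decoder share most of their datapath.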
Based on these hardware optimizations, the hardware codec is described at the RTL level in Verilog HDL. Verification follows a "golden model" strategy in which information at key verification points is compared automatically, and both functional and timing simulation are completed. A hardware SoC is then built with the codec attached as an IP core; system simulation and software debugging are carried out, and FPGA verification is completed. FPGA synthesis results show that the hardware encoder and decoder both run at a clock frequency of 170 MHz with low hardware overhead, and that at a compression ratio of 20 the design strikes a good balance among coding and decoding time, image quality loss, and real-time performance.
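The automated comparison against a golden model can be pictured with the short sketch below. It assumes, purely for illustration, that both the reference software model and the RTL simulation dump integer sequences keyed by checkpoint name; the function name `compare_checkpoints` and the checkpoint names `dwt_out` and `mq_bytes` are hypothetical and do not come from the thesis.

```python
from itertools import zip_longest


def compare_checkpoints(golden, dut):
    """Compare per-checkpoint integer sequences from a golden model and a DUT.

    Both arguments map a checkpoint name to a list of integers.
    Returns a list of (name, index, expected, actual) tuples, one per
    checkpoint that diverges; index is None when a checkpoint is missing
    on one side.
    """
    mismatches = []
    for name in sorted(golden.keys() | dut.keys()):
        expected = golden.get(name)
        actual = dut.get(name)
        if expected is None or actual is None:
            mismatches.append((name, None, expected, actual))
            continue
        # zip_longest pads the shorter list with None, so a length
        # difference shows up as a mismatch at the first missing index.
        for idx, (e, a) in enumerate(zip_longest(expected, actual)):
            if e != a:
                mismatches.append((name, idx, e, a))
                break  # report only the first divergence per checkpoint
    return mismatches


if __name__ == "__main__":
    golden = {"dwt_out": [1, 2, 3], "mq_bytes": [0xFF, 0x10]}
    dut = {"dwt_out": [1, 2, 3], "mq_bytes": [0xFF, 0x11]}
    for name, idx, e, a in compare_checkpoints(golden, dut):
        print(f"{name}[{idx}]: expected {e}, got {a}")
```

Reporting only the first divergence per checkpoint keeps the log short, since later mismatches in the same stream are usually consequences of the first one.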
【Degree-granting institution】: Harbin Institute of Technology
【Degree level】: Master's
【Year conferred】: 2016
【Classification number】: TN919.81
Article ID: 2014658
Article link: https://www.wllwen.com/kejilunwen/xinxigongchenglunwen/2014658.html