Research on 3D Video Coding Based on ROI and JND

Published: 2018-02-01 03:14

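Keywords: 3D video coding, depth information, ROI, JND, H.264, quantization parameter  Source: Beijing Jiaotong University, 2017 master's thesis  Document type: degree thesis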

[Abstract]: 3D video gives viewers an immersive visual experience, but its huge data volume puts great pressure on video storage and transmission. How to use as little transmission bandwidth as possible while preserving subjective video quality is therefore a major challenge that 3D video coding urgently needs to solve. Previous coding algorithms concentrated on removing spatial and temporal redundancy; we turn our attention to removing the visual redundancy in video. To this end, this thesis proposes combining the characteristics of human vision with the existing H.264 coding framework, exploiting the regions of interest and the just noticeable distortion in 3D video to reduce the bit rate as far as possible while preserving subjective quality, and thereby improve compression efficiency.

Region of interest (ROI) coding controls how macroblock quantization parameters (QP) are allocated between the background and the ROI of a video, improving compression efficiency while preserving subjective quality. For 2D video, ROI detection performance is unstable, which has limited the adoption of ROI coding. The depth information contained in 3D video, however, is closely related to the degree of interest in the human visual system (HVS) model, which provides favorable conditions for ROI detection in 3D video. This thesis therefore incorporates depth information and proposes two 3D saliency detection algorithms.

Just noticeable distortion (JND) refers to the phenomenon, caused by the physiological and psychological characteristics of the human visual system, that different regions of an image have different sensitivity to distortion. When the distortion in a given image region is below the JND threshold, the human eye cannot perceive it. JND video coding targets the visual redundancy of video: it takes the characteristics of human vision into account during encoding and allocates coding resources accordingly, further improving coding efficiency.

This thesis uses the texture and depth information in 3D video (single-view video plus depth map format) to study ROI and JND coding compatible with the H.264 standard. It first studies the model of the human stereoscopic visual system and analyzes the relationship between scene depth and the degree of interest in the HVS. On this basis it proposes a depth-based stereoscopic projection saliency detection algorithm; by further exploring the relationship between depth and background regions in the scene, it proposes a 3D saliency detection algorithm based on background detection. Because the human eye pays different amounts of attention to different depths, and because objects occlude one another between viewpoints, it proposes a model that computes JND thresholds from the depth map. It then examines the H.264 compression standard and how the compressed bit rate is composed and adjusted. Accordingly, before H.264 compression, the video frame is partitioned into regions by combining ROI and JND, and a hierarchical quantization model that better matches human visual characteristics is established to guide the choice of quantization parameters for regions of interest, further improving the subjective quality of the ROI and the coding efficiency. Finally, the effect of this hierarchical quantization strategy on the compressed bit rate is analyzed both theoretically and experimentally. The experimental results show that, at the same bit rate, the proposed scheme preserves the visually sensitive regions with lower distortion and gives users a better visual experience.
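The abstract does not give the details of the hierarchical quantization model, so the following is only a minimal sketch of the general idea: each macroblock gets a QP chosen from whether it lies in the ROI and how much distortion its JND threshold tolerates. The four-level offset table, the `jnd_split` and `roi_threshold` values, and all function names are hypothetical illustrations and are not taken from the thesis. The two facts relied on are that H.264 QP values for 8-bit video range from 0 to 51 and that the quantizer step size roughly doubles for every increase of 6 in QP.

```python
from typing import List

QP_MIN, QP_MAX = 0, 51  # valid H.264 QP range for 8-bit video


def clamp_qp(qp: int) -> int:
    """Keep a QP inside the range the H.264 syntax allows."""
    return max(QP_MIN, min(QP_MAX, qp))


def macroblock_qp(is_roi: bool, jnd: float, base_qp: int,
                  jnd_split: float = 8.0) -> int:
    """Pick a QP for one macroblock from a four-level table.

    The offsets below are assumed values chosen for illustration only.
    """
    # A high JND threshold means distortion there is harder to notice.
    tolerant = jnd >= jnd_split
    if is_roi:
        offset = -2 if tolerant else -4   # spend more bits on the ROI
    else:
        offset = 6 if tolerant else 3     # save bits in the background
    return clamp_qp(base_qp + offset)


def qp_map(saliency: List[List[float]], jnd: List[List[float]],
           base_qp: int = 30, roi_threshold: float = 0.5) -> List[List[int]]:
    """Build a per-macroblock QP map from per-macroblock saliency and JND."""
    return [
        [macroblock_qp(s >= roi_threshold, j, base_qp)
         for s, j in zip(s_row, j_row)]
        for s_row, j_row in zip(saliency, jnd)
    ]


def qstep_ratio(qp_a: int, qp_b: int) -> float:
    """Approximate ratio of quantizer step sizes between two QPs."""
    return 2.0 ** ((qp_a - qp_b) / 6.0)


if __name__ == "__main__":
    saliency = [[0.9, 0.2], [0.6, 0.1]]   # one value per macroblock
    jnd = [[3.0, 3.0], [10.0, 12.0]]      # mean JND threshold per macroblock
    print(qp_map(saliency, jnd, base_qp=30))
```

In this sketch a background macroblock with a high JND threshold is quantized with a step size about twice that of a macroblock at the base QP, which is where the bit-rate saving would come from; a real encoder integration would pass such a map to the encoder's per-macroblock QP interface.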
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TN919.81

[References]