Research on Fast Algorithms for 3D Video Coding Based on 3D-HEVC

Published: 2018-06-14 19:47

Topic: 3D-HEVC + coding unit; Source: Zhejiang University, 2017 doctoral dissertation

[Abstract]: 3D video systems can give viewers an immersive visual experience. In recent years, as display technology has improved, 3D video has come into wide use in applications such as 3D movies, free-viewpoint television, 3D home entertainment, and virtual reality systems. Compared with conventional 2D video, however, 3D video usually contains texture maps and depth maps from multiple views plus some auxiliary information, so its data volume is larger and it puts more pressure on storage space and transmission bandwidth. To compress 3D video data effectively, international standardization organizations set up a joint collaborative team on 3D video coding extensions, which built the 3D extension 3D-HEVC on top of the 2D High Efficiency Video Coding (HEVC) standard; it achieves a higher compression ratio than earlier multi-view video coding standards. 3D-HEVC inherits the quadtree partitioning structure of HEVC and, by introducing several new techniques, greatly improves compression efficiency, but it also increases computational complexity. How to exploit the new features of 3D-HEVC to reduce its complexity and move it toward real-time use has therefore become an important and pressing problem. Against this background, this dissertation studies fast algorithms for 3D-HEVC and optimizes three parts of the encoder: coding unit (CU) size selection, depth map intra prediction, and depth map frame rate up-conversion.
To reduce the complexity of coding tree unit (CTU) partitioning, a fast CU size selection algorithm based on spatio-temporal and inter-view correlation is proposed for the base view and the non-base views of 3D video. The base view is coded first; it has no reference view and uses no disparity vector prediction, so the algorithm exploits the spatio-temporal correlation of the hierarchical CU quadtree partitioning within the current view to skip CU sizes that are too large or too small, which reduces the time spent on size selection. Once the base view has been coded and reconstructed, it is mapped to the non-base view to be coded according to the 3D spatial transformation between views, and a hole mask is generated in the process. For depth video in a non-base view, the hole information at the corresponding position in the hole mask is used to terminate the recursive CTU partitioning early; for texture video in a non-base view, the hole information is combined with inter-view correlation to speed up CTU partitioning.
For depth map intra coding, a fast intra prediction algorithm based on the gray-level co-occurrence matrix (GLCM) is proposed. Before intra prediction, the algorithm generates a GLCM for each CU in the depth map. First, the correlation feature of the GLCM is computed to obtain the dominant reference direction for intra prediction, and only the angular prediction modes within the range of that direction are added to the rough mode decision candidate list. Then, the angular second moment feature of the GLCM and whether neighboring blocks used depth modeling modes (DMMs) are used to decide whether the current depth CU is a smooth block; for smooth blocks, the DMMs are not added to the rate-distortion candidate list. Finally, the rate-distortion costs of the modes in the rate-distortion candidate list are computed and compared to obtain the final intra prediction mode. The algorithm effectively reduces the computational complexity of depth map intra coding while maintaining coding efficiency.
For depth video coded by 3D-HEVC at a low frame rate, a frame rate up-conversion algorithm based on graph-cut-optimized motion search is proposed. Depth maps in odd-numbered access units are skipped and not coded; after decoding, the skipped depth frames are reconstructed by interpolation, using bidirectional motion compensation from the preceding and following frames. Interpolation is performed with the CTU as the basic unit, and the size and search range of each interpolated block are determined by the motion information of the corresponding texture map. To keep the motion vector field smooth across blocks, the motion search for all blocks in a CTU is formulated as a single global energy minimization problem, in which the matching cost term is the synthesized view distortion that measures the quality of the interpolated block. Finally, a graph cut optimization algorithm solves the energy minimization problem to obtain the final motion vectors. Compared with virtual views synthesized from normally coded depth maps, virtual intermediate views synthesized from depth maps reconstructed by this algorithm show very little quality loss, while bit rate and encoding time are both reduced.
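The two GLCM features named in the abstract, correlation and angular second moment, can be illustrated with a small sketch. This is not the dissertation's implementation: the four pixel offsets, the rule of picking the direction with the highest correlation, the `asm_threshold` value, and all function names are assumptions made for illustration only.

```python
# Illustrative sketch only; names, offsets, and threshold are hypothetical.

def glcm(block, dx, dy, levels):
    """Normalized gray-level co-occurrence matrix for pixel offset (dx, dy).

    `block` is a 2D list of integers in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(block), len(block[0])
    total = 0
    for y in range(h):
        for x in range(w):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < h and 0 <= x2 < w:
                counts[block[y][x]][block[y2][x2]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]


def angular_second_moment(p):
    """Sum of squared GLCM entries; equals 1.0 for a perfectly flat block."""
    return sum(v * v for row in p for v in row)


def correlation(p):
    """GLCM correlation feature; returns 0.0 when it is undefined (zero variance)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu_i = sum(i * v for i, _, v in cells)
    mu_j = sum(j * v for _, j, v in cells)
    var_i = sum((i - mu_i) ** 2 * v for i, _, v in cells)
    var_j = sum((j - mu_j) ** 2 * v for _, j, v in cells)
    if var_i == 0 or var_j == 0:
        return 0.0  # convention chosen for this sketch
    cov = sum((i - mu_i) * (j - mu_j) * v for i, j, v in cells)
    return cov / (var_i * var_j) ** 0.5


# Assumed set of directions: one pixel offset per direction.
OFFSETS = {
    "horizontal": (1, 0),
    "diag_45": (1, -1),
    "vertical": (0, -1),
    "diag_135": (-1, -1),
}


def dominant_direction(block, levels):
    """Assumed rule: the direction whose GLCM has the highest correlation."""
    scores = {
        name: correlation(glcm(block, dx, dy, levels))
        for name, (dx, dy) in OFFSETS.items()
    }
    return max(scores, key=scores.get)


def is_smooth_block(block, levels, neighbor_used_dmm, asm_threshold=0.9):
    """Assumed rule: smooth if ASM is high and no neighbor used a DMM."""
    asm = angular_second_moment(glcm(block, 1, 0, levels))
    return asm >= asm_threshold and not neighbor_used_dmm


# Vertical stripes: pixels match along columns, alternate along rows.
stripes = [[0, 3, 0, 3] for _ in range(4)]
print(dominant_direction(stripes, 4))  # vertical
```

In an encoder, a result such as `vertical` would limit the rough mode decision list to angular modes near that direction, and a block judged smooth would skip the DMM candidates. Depth samples would first need to be quantized to `levels` gray levels.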
[Degree-granting institution]: Zhejiang University
[Degree level]: Doctoral
[Year degree conferred]: 2017
[Classification number]: TN919.81

[Similar literature]