Research on Key Algorithms for Scalable Multi-View Stereoscopic Video Coding and Transmission

Published: 2018-06-08 21:22

Topic: scalable + multi-view; Source: Wuhan University (武汉大学), 2014 doctoral dissertation

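[Abstract (translated from the Chinese)]: With the rapid development of computer hardware and stereoscopic display technology, multi-view stereoscopic video is gradually entering everyday life. Multi-view stereoscopic video integrates video from different viewpoints: besides letting users perceive a three-dimensional scene, it also provides interactivity between the viewer and the playback system. Unlike earlier video, a multi-view stereoscopic video system lets viewers perceive changes in picture depth and, through a suitable playback system, presents the captured scene completely and faithfully from many angles. Thanks to its multiple viewing angles and stereoscopic image reproduction, viewers experience visual effects that earlier video could not offer. Compared with conventional video, multi-view stereoscopic video involves far more data, and the views are highly similar to one another, so developing efficient compression for it is especially important. To support interactive stereoscopic viewing from an arbitrary viewpoint, the video data of every view is indispensable, as is the disparity and switching information required by the interactive presentation system; how to compress this information effectively will be a focus of future research on multi-view stereoscopic video compression. Research on multi-view stereoscopic video coding must also consider the client hardware and the network environment. With the recent growth in the computing power of mobile phones, watching digital television on a phone is no longer a dream, and the quality of mobile audio-visual services has become an area of growing attention. Because network environments and packet transmission behavior differ, each must be handled according to its own characteristics, and quickly adjusting the current bit rate of multi-view video to the available bandwidth, so as to improve system performance and service quality, has become an important problem. Based on the characteristics of multi-view stereoscopic video and the high correlation between its views, this thesis studies how to extract depth maps quickly and optimizes the reference structure for inter-view prediction, reducing redundant computation between views and improving coding efficiency. It then proposes a parallel architecture for scalable multi-view stereoscopic video coding, and finally studies transmission strategies for this coding scheme over networks, giving the video stream fast random switching capability and widening the range of applications of multi-view stereoscopic video. The main work of this thesis is as follows. (1) To address the poor accuracy of existing depth estimation for multi-view stereoscopic video, the thesis studies how to extract depth maps quickly and accurately for use by the subsequent coding algorithms. This includes finding scale-invariant feature points in the images of the multiple views, establishing correspondences between the feature points, and then using those correspondences with an adaptive matching algorithm to build local matches. Optical flow, Markov random fields, and other algorithms are combined so that the initial depth map is visually more comfortable and retains detail, the depth results are continuous in the temporal domain, and errors and outliers in weakly textured regions are reduced. (2) A fast mode selection algorithm is proposed based on the correlation between images of different views and on depth information. The modes selected when compressing the images of neighboring views are used to reduce the number of modes that must be evaluated for each macroblock, and a threshold-based early termination mechanism further accelerates mode selection. In addition, the result of motion vector compensation is used to decide whether disparity vector compensation needs to be computed, and the search ranges of motion vector compensation and of disparity vector compensation are each reduced. (3) Building on research into parallelization algorithms for scalable video and for multi-view video, a parallel coding algorithm for scalable multi-view video is proposed. The access units, prediction order, and coding structure of scalable multi-view stereoscopic video coding are studied, so that the coding can be adjusted according to the number of views and the number of enhancement layers, and overall coding performance is improved. (4) During playback of scalable multi-view stereoscopic video, the client can freely select different views or bit rates. To increase switching speed without adding much storage, the thesis studies a multi-layer scalable video coding method that uses SP frames so that video transmission supports seamless switching. Tailored to the characteristics of multi-view stereoscopic video, the method combines the high coding efficiency of non-scalable video coding with the flexibility of scalable video coding, achieving better streaming video quality of service over a wider bandwidth range and reducing the storage the video occupies. In summary, with the coding efficiency of scalable multi-view stereoscopic video as its goal and inter-view correlation as its basis, the thesis studies depth map generation, mode selection, fast estimation, parallel coding, and bitstream switching in depth, and presents the corresponding simulation experiments and performance analysis.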
[Abstract]: With the rapid development of computer hardware and stereoscopic display technology, multi-view stereoscopic video is gradually entering everyday life. It integrates video from different viewpoints, so viewers not only perceive a three-dimensional scene but can also interact with the playback system. Unlike conventional video, a multi-view stereoscopic video system conveys changes in scene depth and, through a suitable playback system, presents the captured scene faithfully from many angles.

Compared with conventional video, multi-view stereoscopic video involves far more data, and the views are highly similar to one another, so efficient compression is especially important. To support interactive stereoscopic viewing from an arbitrary viewpoint, the video data of every view is indispensable, as is the disparity and switching information required by the interactive presentation system. How to compress this information effectively will be a focus of future research on multi-view stereoscopic video compression.

Research on multi-view stereoscopic video coding must also take the client hardware and the network environment into account. With the recent growth in the computing power of mobile phones, watching digital television on a phone is no longer a dream, and the quality of mobile audio-visual services is receiving increasing attention. Because network environments and packet transmission behavior differ, each must be handled according to its own characteristics. Quickly adjusting the current bit rate of multi-view video to the available bandwidth, so as to improve system performance and service quality, has therefore become an important problem.
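
As a rough illustration of adapting the bit rate of a layered stream to the available bandwidth, the following minimal sketch picks how many layers to send. The function name `select_layers` and the per-layer bit rates are hypothetical and are not taken from the thesis; the sketch assumes layers must be sent in order, base layer first.

```python
from typing import Sequence


def select_layers(layer_kbps: Sequence[float], bandwidth_kbps: float) -> int:
    """Return how many layers (base layer first) fit in the given bandwidth.

    Assumption: each layer depends on all layers before it, so layers can
    only be added in order and the first layer that does not fit ends the scan.
    """
    total = 0.0
    count = 0
    for rate in layer_kbps:
        if total + rate > bandwidth_kbps:
            break
        total += rate
        count += 1
    return count


# Hypothetical stream: one base layer and two enhancement layers (kbit/s).
layers = [400.0, 300.0, 500.0]
print(select_layers(layers, 800.0))   # 2: base layer plus first enhancement layer
print(select_layers(layers, 350.0))   # 0: even the base layer does not fit
```

A real system would re-run such a decision whenever the bandwidth estimate changes; the thesis abstract does not specify how its own rate adjustment is performed.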

Based on the characteristics of multi-view stereoscopic video, and exploiting the high correlation between its views, this thesis studies how to extract depth maps quickly and optimizes the reference structure for inter-view prediction, which reduces redundant computation between views and improves coding efficiency. It then proposes a parallel architecture for scalable multi-view stereoscopic video coding. Finally, it studies transmission strategies for this coding scheme over networks, giving the video stream fast random switching capability and widening the range of applications of multi-view stereoscopic video. The main work is as follows.

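(1) To address the poor accuracy of existing depth estimation for multi-view stereoscopic video, the thesis studies how to extract depth maps quickly and accurately for use by the subsequent coding algorithms. Scale-invariant feature points are located in the images of the multiple views, correspondences between them are established, and an adaptive matching algorithm then uses these correspondences to build local matches. Optical flow and Markov random fields, among other algorithms, are combined so that the initial depth map is visually more comfortable and retains detail, the depth results are continuous over time, and errors and outliers in weakly textured regions are reduced.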
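
To make the link between matching across views and depth concrete, here is a minimal sketch under simplifying assumptions. It uses plain sum-of-absolute-differences (SAD) block matching on a single scanline of a rectified stereo pair, not the feature-based, optical-flow, and Markov-random-field method of the thesis, and then converts disparity to depth with the standard relation Z = f * B / d. All names and numbers are hypothetical.

```python
from typing import Sequence


def sad(a: Sequence[int], b: Sequence[int]) -> int:
    """Sum of absolute differences between two equally long windows."""
    return sum(abs(x - y) for x, y in zip(a, b))


def scanline_disparity(left: Sequence[int], right: Sequence[int],
                       x: int, half_window: int, max_disp: int) -> int:
    """Disparity of left-image pixel x found by SAD block matching.

    Assumes rectified images, so the match for left[x] lies at right[x - d].
    """
    lo, hi = x - half_window, x + half_window + 1
    best_d, best_cost = 0, float("inf")
    for d in range(max_disp + 1):
        if lo - d < 0:          # window would leave the right image
            break
        cost = sad(left[lo:hi], right[lo - d:hi - d])
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


def depth_from_disparity(disparity: float, focal_px: float, baseline_m: float) -> float:
    """Depth in metres from disparity in pixels: Z = f * B / d."""
    if disparity <= 0:
        return float("inf")     # zero disparity corresponds to a point at infinity
    return focal_px * baseline_m / disparity


# Hypothetical scanlines: the bright feature is shifted by 2 pixels between views.
right = [10, 10, 50, 90, 50, 10, 10, 10, 10, 10]
left = [10, 10, 10, 10, 50, 90, 50, 10, 10, 10]
d = scanline_disparity(left, right, x=5, half_window=1, max_disp=4)
print(d, depth_from_disparity(d, focal_px=800.0, baseline_m=0.5))  # 2 200.0
```

Block matching of this kind is exactly what tends to fail in weakly textured regions, which is the weakness that regularization such as a Markov random field is meant to reduce.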

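(2) A fast mode selection algorithm is proposed based on the correlation between images of different views and on depth information. The modes selected when compressing the images of neighboring views are used to reduce the number of modes that must be evaluated for each macroblock, and a threshold-based early termination mechanism further accelerates mode selection. In addition, the result of motion vector compensation is used to decide whether disparity vector compensation needs to be computed, and the search ranges of motion vector compensation and of disparity vector compensation are each reduced.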
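
The idea in item (2) can be sketched roughly as follows. This is an illustrative simplification, not the algorithm of the thesis: modes chosen by neighboring views are tried first, and the search stops as soon as the best cost falls below a threshold. The mode list, the cost values, and the function name `fast_mode_decision` are assumptions made for the example.

```python
from typing import Callable, Iterable, Optional, Tuple

# Hypothetical macroblock mode set, loosely modelled on H.264 partition modes.
ALL_MODES = ("SKIP", "16x16", "16x8", "8x16", "8x8", "INTRA")


def fast_mode_decision(neighbor_modes: Iterable[str],
                       rd_cost: Callable[[str], float],
                       threshold: float) -> Tuple[Optional[str], int]:
    """Return (best_mode, number_of_modes_evaluated).

    Modes used by the co-located blocks in neighbouring views are evaluated
    first; evaluation terminates early once the best cost is below threshold.
    """
    # dict.fromkeys removes duplicates while keeping first-seen order.
    ordered = list(dict.fromkeys(list(neighbor_modes) + list(ALL_MODES)))
    best_mode, best_cost, evaluated = None, float("inf"), 0
    for mode in ordered:
        cost = rd_cost(mode)
        evaluated += 1
        if cost < best_cost:
            best_mode, best_cost = mode, cost
        if best_cost < threshold:   # threshold-based early termination
            break
    return best_mode, evaluated


# Hypothetical rate-distortion costs for one macroblock.
costs = {"SKIP": 120.0, "16x16": 40.0, "16x8": 55.0,
         "8x16": 60.0, "8x8": 80.0, "INTRA": 200.0}
print(fast_mode_decision(["16x16"], costs.__getitem__, threshold=50.0))  # ('16x16', 1)
print(fast_mode_decision([], costs.__getitem__, threshold=10.0))         # ('16x16', 6)
```

When the neighboring view predicts the mode well, only one cost evaluation is needed instead of six; when the threshold is never reached, the sketch degrades to an exhaustive search.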

(3) Building on research into parallelization algorithms for scalable video and for multi-view video, a parallel coding algorithm for scalable multi-view video is proposed. The access units, prediction order, and coding structure of scalable multi-view stereoscopic video coding are studied.
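
One way to picture parallelism across views and layers is a wavefront schedule. The sketch below is an illustration under an assumed dependency structure, not the architecture of the thesis: it assumes that the coding unit for view `v` and layer `l` depends only on unit `(v - 1, l)` (inter-view prediction) and unit `(v, l - 1)` (inter-layer prediction), so all units with the same `v + l` can be encoded concurrently.

```python
from typing import List, Tuple


def wavefronts(num_views: int, num_layers: int) -> List[List[Tuple[int, int]]]:
    """Group (view, layer) units into waves that can be encoded in parallel.

    Assumed dependencies: (v, l) needs (v - 1, l) and (v, l - 1). Every unit
    in wave k has v + l == k, so its dependencies are all in wave k - 1.
    """
    waves = []
    for step in range(num_views + num_layers - 1):
        wave = [(v, step - v) for v in range(num_views)
                if 0 <= step - v < num_layers]
        waves.append(wave)
    return waves


# Hypothetical configuration: 3 views, each with a base and one enhancement layer.
for i, wave in enumerate(wavefronts(3, 2)):
    print(i, wave)
# 0 [(0, 0)]
# 1 [(0, 1), (1, 0)]
# 2 [(1, 1), (2, 0)]
# 3 [(2, 1)]
```

Under this assumption the six units finish in four sequential steps rather than six, and the schedule extends naturally when views or layers are added.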

The resulting scheme lets scalable multi-view stereoscopic video coding adapt to the number of views and the number of enhancement layers, and improves overall coding performance.

(4) During playback of scalable multi-view stereoscopic video, the client can freely select different views or bit rates. To increase switching speed without adding much storage, the thesis studies a multi-layer scalable video coding method that uses SP frames so that video transmission supports seamless switching. Tailored to the characteristics of multi-view stereoscopic video, the method combines the high coding efficiency of non-scalable video coding with the flexibility of scalable video coding, achieving better streaming video quality of service over a wider bandwidth range and reducing the storage the video occupies.

In summary, with the coding efficiency of scalable multi-view stereoscopic video as its goal and inter-view correlation as its basis, the thesis studies depth map generation, mode selection, fast estimation, parallel coding, and bitstream switching in depth, and presents the corresponding simulation experiments and performance analysis.
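
For the stream switching in item (4), the following minimal sketch shows only the scheduling side: a client request to change view or bit rate takes effect at the next switching point. The placement of SP frames, the frame numbers, and the function name `next_switch_point` are hypothetical; how the thesis places and encodes SP frames is not described in the abstract.

```python
from bisect import bisect_left
from typing import Optional, Sequence


def next_switch_point(sp_frames: Sequence[int], request_frame: int) -> Optional[int]:
    """Return the first SP frame at or after the requested frame, if any.

    sp_frames must be sorted in ascending order. Playback of the current
    stream continues until that frame, where the decoder can switch streams.
    """
    i = bisect_left(sp_frames, request_frame)
    return sp_frames[i] if i < len(sp_frames) else None


# Hypothetical stream with an SP frame every 12 frames.
sp_frames = [0, 12, 24, 36]
print(next_switch_point(sp_frames, 13))  # 24
print(next_switch_point(sp_frames, 24))  # 24
print(next_switch_point(sp_frames, 40))  # None
```

Denser SP frames shorten the wait before a switch but cost more bits and storage, which is the trade-off that item (4) addresses.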
[Degree-granting institution]: Wuhan University
[Degree level]: Doctoral
[Year conferred]: 2014
[Classification number]: TN919.81

[References]