Research on Multi-View Depth Map Acquisition and Quality Assessment Methods

Posted: 2018-05-05 17:08

  Keywords: multi-view + depth map; Source: doctoral dissertation, Huazhong University of Science and Technology, 2016


[Abstract]: With the development of video technology, 3D video (3DV), which delivers an immersive experience, and free viewpoint video (FVV), which enables interaction between viewers and content providers, have attracted great attention and extensive research. In such new video systems, multi-view depth information not only greatly reduces the amount of video data required, but also enables rendering of arbitrary viewpoints and improves system flexibility; multi-view depth information therefore plays a fundamental and crucial role. On the other hand, depth values represent the distance from the scene to the imaging plane and can only be obtained by measurement or computation, which inevitably introduces errors. Consequently, acquiring high-quality depth maps and assessing their quality are problems of great significance for 3D video and free viewpoint video systems. This dissertation studies three aspects of this topic. First, it studies multi-view depth map acquisition with multiple RGB-D cameras and solves the problem of interference among multiple depth cameras. Second, since existing RGB-D camera products cannot satisfy the depth-map requirements of new video applications, it proposes a hybrid acquisition scheme based on phase-based structured light and active stereo matching. Finally, it studies depth map quality assessment; since an undistorted reference depth map is usually unavailable, it proposes a no-reference scheme, grounded in the physical meaning of depth, for detecting depth errors and assessing quality.

First, depth camera technology has developed rapidly in recent years, and RGB-D cameras such as the Microsoft Kinect can acquire depth maps conveniently. However, RGB-D cameras have no multi-device coordination mechanism, so when multiple RGB-D cameras capture the same scene they interfere with one another and depth map quality degrades. This dissertation analyses the interference mechanism and its effects, and then proposes a scheme that removes the interference and recovers the depth values. The analysis shows that RGB-D cameras have a degree of robustness, so interference causes depth values to be lost rather than altered; the lost depth can therefore be recovered from the remaining valid depth values. Furthermore, considering that depth maps behave differently inside objects and at object boundaries, a region-adaptive depth recovery scheme is proposed. Using the texture image as a reference, the depth map is first partitioned into smooth interference regions and boundary interference regions. For smooth interference regions, a Markov random field (MRF) is built in the gradient domain to recover the gradients, and the depth values are then reconstructed with a discrete Poisson equation (DPE). For boundary regions, a texture-guided MRF model is built directly in the depth domain to estimate the depth values. The scheme preserves smoothness inside objects while maintaining sharp edges between objects, thus retaining the geometric structure of the scene.

Second, new video applications such as 3DV and FVV place high demands on depth maps. A high-quality depth map should be accurate and dense, obtainable from a single frame so that it applies to dynamic scenes, and extensible to multi-view acquisition; existing RGB-D cameras cannot meet these requirements well. This dissertation therefore builds on existing structured-light ranging methods and proposes a hybrid scheme for acquiring high-quality depth maps. A stripe-based, multi-frequency sinusoidal pattern is designed: it retains the sinusoidal pattern's ability to carry phase information while also providing good local uniqueness, which makes hybrid acquisition of the depth map possible. Specifically, within each decoding stripe the pattern is sinusoidal, so the wrapped phase can be computed with Fourier transform profilometry (FTP); the phase is then unwrapped accurately and quickly by exploiting the local smoothness of the depth map and the multi-frequency relationship between decoding stripes, and converted to disparity and depth. In addition, for regions that violate the smoothness assumption, the local uniqueness of the pattern is exploited to correct the depth values through spatial stereo matching. Experimental results show that the proposed scheme obtains accurate depth for complex scenes containing depth discontinuities and spatially isolated objects, and can be combined with multiplexing techniques to acquire multi-view depth maps.

Third, depth maps are obtained by measurement and computation, so errors are unavoidable and must be detected; on the other hand, an undistorted reference depth map does not exist, so the common full-reference and reduced-reference quality assessment approaches do not apply to depth maps. To address this, a no-reference depth map quality assessment scheme is proposed. Grounded in the physical meaning of the depth map, the scheme focuses on geometric distortion at depth edges. Since a reference image is unavailable, it exploits the correlation between the texture image and the depth map and matches depth edges against texture edges. Specifically, the texture image and the depth map are two representations of the same scene, so their edges are strongly correlated. The spatial position, direction, and length of edges are used as features to build an edge similarity measure and perform edge matching; a line-segment-based matching method is also adopted to improve robustness. Finally, based on the matching between the two sets of edge segments, erroneous pixels (bad points) in the depth map are identified and the depth map quality is quantified. Experimental results show that the scheme accurately detects edge distortions in the depth map and locates bad points; the proposed no-reference quality index correlates strongly with existing full-reference indices as well as with the quality of synthesized virtual views. The error detection scheme can also support follow-up work such as depth error correction and quality enhancement.

Finally, the dissertation summarizes the above research and its contributions, and discusses future work in light of the development trends of video technology. This work explores multi-view depth map acquisition and no-reference depth map quality assessment, and provides research ideas and solutions for depth-based video systems such as 3DV and FVV.
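For the first contribution, recovery of lost depth in smooth interference regions is described as estimating gradients with an MRF and then integrating them through a discrete Poisson equation. The following is a minimal sketch of that integration step only, not the thesis implementation: it assumes the gradient fields `gx`, `gy` have already been estimated, that the lost-depth mask does not touch the image border, and that the surrounding valid depth values act as Dirichlet boundary conditions. All names and parameters are illustrative.

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def poisson_fill_depth(depth, hole_mask, gx, gy):
    """Reconstruct depth inside `hole_mask` by solving the discrete Poisson
    equation  lap(D) = div(g)  with the surrounding valid depth values as
    Dirichlet boundary conditions.

    depth     : H x W float array, valid outside the hole
    hole_mask : H x W boolean array, True where depth was lost
                (assumed not to touch the image border)
    gx, gy    : H x W gradient fields estimated beforehand (e.g. by an MRF
                in the gradient domain, as the thesis proposes)
    """
    ys, xs = np.nonzero(hole_mask)
    index = -np.ones(depth.shape, dtype=int)
    index[ys, xs] = np.arange(len(ys))            # unknown id per hole pixel

    A = lil_matrix((len(ys), len(ys)))
    b = np.zeros(len(ys))
    for k, (y, x) in enumerate(zip(ys, xs)):
        A[k, k] = 4.0
        # 4*D(p) - sum of neighbours = -div(g)(p), backward-difference divergence
        b[k] = -((gx[y, x] - gx[y, x - 1]) + (gy[y, x] - gy[y - 1, x]))
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if hole_mask[ny, nx]:
                A[k, index[ny, nx]] = -1.0        # neighbouring unknown
            else:
                b[k] += depth[ny, nx]             # known boundary depth value

    recovered = depth.astype(float, copy=True)
    recovered[ys, xs] = spsolve(A.tocsr(), b)
    return recovered
```

Boundary interference regions are handled differently in the thesis, with a texture-guided MRF built directly in the depth domain; that step is not covered by this sketch.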
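For the second contribution, the wrapped phase inside each sinusoidal stripe is computed with Fourier transform profilometry (FTP) and then unwrapped by exploiting depth smoothness and the stripes' multi-frequency structure. The toy sketch below shows only the generic FTP step on a synthetic 1D fringe; the carrier frequency `f0`, the band-pass width, and the use of `np.unwrap` as a stand-in for the thesis's unwrapping strategy are assumptions, and the calibrated conversion of unwrapped phase to disparity and depth is omitted.

```python
import numpy as np

def ftp_wrapped_phase(row, f0):
    """Wrapped phase of one fringe row via Fourier transform profilometry:
    band-pass the positive fundamental lobe around the carrier frequency
    f0 (in cycles per pixel), invert, and remove the linear carrier."""
    n = len(row)
    spec = np.fft.fft(row - np.mean(row))
    freqs = np.fft.fftfreq(n)
    band = (freqs > 0.5 * f0) & (freqs < 1.5 * f0)   # keep only the +f0 lobe
    analytic = np.fft.ifft(np.where(band, spec, 0.0))
    x = np.arange(n)
    return np.angle(analytic * np.exp(-2j * np.pi * f0 * x))

# Toy usage: a synthetic fringe whose phase encodes a smooth surface bump.
x = np.arange(1024)
f0 = 1.0 / 16.0                                   # carrier: one period per 16 pixels
true_phase = 8.0 * np.exp(-((x - 512.0) / 120.0) ** 2)
fringe = 128.0 + 100.0 * np.cos(2.0 * np.pi * f0 * x + true_phase)

wrapped = ftp_wrapped_phase(fringe, f0)
unwrapped = np.unwrap(wrapped)   # simple continuity unwrap; the thesis instead
                                 # uses depth smoothness and the stripes'
                                 # multi-frequency relationship for this step
```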
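For the third contribution, depth map quality is assessed without a reference by matching depth-edge line segments against texture-edge line segments using their spatial position, direction, and length. The sketch below is a simplified illustration of that idea rather than the thesis's algorithm; the Canny/Hough parameters, the tolerances, and the 0.3 match threshold are arbitrary placeholders, and `texture_gray` is assumed to be an 8-bit grayscale image.

```python
import cv2
import numpy as np

def edge_segments(img, canny_lo=50, canny_hi=150):
    """Detect edges and fit line segments (x1, y1, x2, y2) to them."""
    edges = cv2.Canny(img, canny_lo, canny_hi)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=40,
                            minLineLength=20, maxLineGap=5)
    return np.empty((0, 4)) if lines is None else lines[:, 0, :].astype(float)

def segment_similarity(s, t, pos_tol=8.0, ang_tol=np.pi / 12):
    """Similarity in [0, 1] built from spatial position, direction and
    length, the three edge features named in the abstract."""
    sm, tm = (s[:2] + s[2:]) / 2, (t[:2] + t[2:]) / 2         # midpoints
    d_pos = np.linalg.norm(sm - tm)
    a_s = np.arctan2(s[3] - s[1], s[2] - s[0])
    a_t = np.arctan2(t[3] - t[1], t[2] - t[0])
    d_ang = abs((a_s - a_t + np.pi / 2) % np.pi - np.pi / 2)  # undirected lines
    l_s = np.linalg.norm(s[2:] - s[:2])
    l_t = np.linalg.norm(t[2:] - t[:2])
    if d_pos > pos_tol or d_ang > ang_tol:
        return 0.0
    return (1 - d_pos / pos_tol) * (1 - d_ang / ang_tol) * min(l_s, l_t) / max(l_s, l_t)

def no_reference_depth_score(texture_gray, depth):
    """Fraction of depth-edge segments that find a matching texture-edge
    segment; unmatched depth edges indicate likely geometric distortion."""
    depth8 = cv2.normalize(depth, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    tex_seg = edge_segments(texture_gray)
    dep_seg = edge_segments(depth8)
    if len(dep_seg) == 0:
        return 1.0
    matched = sum(max((segment_similarity(d, t) for t in tex_seg), default=0.0) > 0.3
                  for d in dep_seg)
    return matched / len(dep_seg)
```

The returned score is the fraction of depth-edge segments with a texture counterpart; unmatched depth segments mark candidate bad points, which the thesis further uses to localize and quantify depth errors.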

[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Doctoral
[Year of conferral]: 2016
[CLC classification number]: TP391.41



