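Research on Analysis and Processing Techniques for Stereoscopic Visual Media
Published: 2018-03-31 13:04
Topic: binocular stereo media  Entry point: depth computation  Source: Nanjing University, doctoral dissertation, 2017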
[Abstract]: VR, AR, and IMAX3D have become familiar buzzwords in recent years, mainly because stereoscopic capture devices have become widespread and the volume of stereo media has surged, giving more people the chance to encounter, use, and study stereo media. Although stereo media can be represented in many ways, this dissertation focuses on binocular stereo media, which record information in a way that imitates the human eyes, and studies their content analysis and processing. Compared with traditional multimedia processing, the key to stereo media processing lies in mining and exploiting the differences and correspondences between the two views. The relationship between the parallel views, at once opposed and unified, adds cues for content processing but also adds interference, so the quality and efficiency of stereo media processing can only be improved in practice by new methods that account for these new properties of the medium. Addressing several key foundational problems in stereo media content analysis, the dissertation surveys the state of research in China and abroad, analyzes the main open problems, and proposes corresponding solutions. It also explores the related processing techniques in depth. The main innovations and contributions are as follows.

1. A fast depth estimation method for stereo video. The method exploits redundant information between video frames and uses adaptive motion interpolation to significantly improve computational efficiency while preserving the temporal continuity of the depth sequence. Most existing depth computation methods for stereo media build on stereo matching of binocular images. Such methods usually need a suitable disparity range to be set before they perform at their best, so applying them directly to stereo video tends to produce discontinuities in the depth sequence. Existing depth computation methods designed for stereo video introduce extensive global optimization to ensure temporal continuity of depth, which makes computational efficiency hard to guarantee. By analyzing the characteristics of stereo video, the dissertation combines fine-grained depth computation with coarse-grained depth estimation through motion vectors and proposes a fast depth estimation method based on motion interpolation. The method matches global optimization methods in accuracy and saves more than half of the computation time.

2. A multi-object objectness proposal method. By building a context-aware multi-object objectness proposal model, the method effectively resolves the inconsistent proposals and redundant computation caused by frame-by-frame objectness proposal. Existing objectness proposal research concentrates on images. Work targeting video mostly starts from applying image methods frame by frame and mainly targets moving or salient objects. Experiments show that frame-by-frame objectness proposal not only involves redundant computation but, more importantly, tends to produce object proposals that are inconsistent over time. To solve these problems, the dissertation proposes a context-aware multi-object objectness proposal method that uses an adaptive mapping strategy to combine spatial-domain and temporal-domain objectness proposal, providing a general and effective way to apply strong objectness proposal results to video. In addition, given the current lack of datasets for multi-object objectness proposal in video, a multi-object video dataset with an average object count of 3.34 was constructed to promote research in the field.

3. A multi-salient-object detection method based on view fusion. The method effectively exploits the inconsistency of object detection across views to further improve the accuracy of salient object detection. Salient object detection today mostly assumes that a scene contains a single salient object. Multi-salient-object detection has not yet been studied at scale, and the existing work related to multiple salient objects is mainly carried out on monocular images. Experiments show that when monocular multi-salient-object detection methods are applied to binocular images, object proposals tend to be inconsistent across views. To address this, the dissertation proposes a view-fusion-based multi-salient-object detection method that examines the relationship between salient object boxes in the parallel views and uses a dual probability estimate, of saliency and of objectness, to refine the scores of the salient object boxes, thereby improving the accuracy and precision of the final multi-salient-object detection.

4. A method for presenting a dynamic stereoscopic effect on flat displays. The method serves the large body of existing stereo images and offers a new route to glasses-free 3D for stereo images. Without auxiliary hardware, stereo images found on the Internet and elsewhere cannot convey depth on an ordinary display, which is a bottleneck to their wider adoption. Current flat-display dynamic 3D presentation methods based on motion parallax lack a complete analysis and model of how the human eye perceives depth, so their results tend to flicker and cause viewing discomfort. By analyzing the human visual system, motion parallax, persistence of vision, and related phenomena, the dissertation proposes a dynamic presentation method for stereo images on flat display devices that successfully conveys the 3D sensation of stereo images to the user, creating more possibilities for the further development of stereo images.

5. A method for refocusing stereo video. The method builds a computational photography model to produce a refocusing effect resembling that of an SLR camera. Existing stereo video mainly serves cinemas and VR/AR devices and is rarely found in ordinary users' daily lives. In fact, the depth information implicit in stereo video enables richer content processing. When video refocusing is implemented purely in software, the output can hardly shake off traces of artificial processing. Based on an understanding of photographic concepts such as the focal plane, depth of field, and circle of confusion, the dissertation builds a computational photography model for stereo video refocusing that achieves SLR-like video refocusing for ordinary users.

Building on the key techniques and content processing above, the dissertation also outlines several directions for future research, showing the systematic and extensible nature of the work and its supporting role for related research areas, and indicating that its results have good application prospects in stereo media research.
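Contribution 5 rests on the photographic notions of focal plane, depth of field, and circle of confusion. As a rough illustration only, the Python sketch below shows how a per-pixel blur size could be derived from stereo disparity using two textbook relations: the rectified-stereo depth formula Z = f·B/d and the thin-lens circle-of-confusion formula. It is not the dissertation's computational photography model, which the abstract does not specify. The function names, parameters, and camera values are all hypothetical.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d.

    Assumes a pinhole camera model; focal_px is the focal length in pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def coc_diameter(depth_m, focus_m, focal_len_m, f_number):
    """Thin-lens circle-of-confusion diameter on the sensor, in metres.

    c = A * f * |Z - S| / (Z * (S - f)), with aperture diameter A = f / N,
    object depth Z, and focus distance S. Requires S > f.
    """
    if focus_m <= focal_len_m:
        raise ValueError("focus distance must exceed the focal length")
    aperture_m = focal_len_m / f_number
    return (aperture_m * focal_len_m * abs(depth_m - focus_m)
            / (depth_m * (focus_m - focal_len_m)))


if __name__ == "__main__":
    # Hypothetical rig: 1000 px focal length, 10 cm baseline.
    near = depth_from_disparity(50.0, focal_px=1000.0, baseline_m=0.1)
    far = depth_from_disparity(20.0, focal_px=1000.0, baseline_m=0.1)
    # Simulated lens (assumed values): 50 mm at f/2, focused on the near object.
    for z in (near, far):
        c = coc_diameter(z, focus_m=near, focal_len_m=0.05, f_number=2.0)
        print(f"depth {z:.2f} m -> blur diameter {c * 1000:.3f} mm")
```

In a refocusing pipeline of this kind, the blur diameter would drive a depth-dependent blur kernel per pixel. Objects on the chosen focal plane get zero blur, and the blur grows as depth departs from it.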
Article ID: 1690856
Article link: https://www.wllwen.com/shoufeilunwen/xxkjbs/1690856.html