Real-Time 3D Tracking for Augmented Reality
Published: 2018-06-30 01:18
Topics: real-time + augmented reality; Source: Zhejiang University, doctoral dissertation, 2010
[Abstract]: As computing power has grown, computer vision research has advanced steadily. Its applications in surveillance, retrieval, recognition, navigation, medicine, education, and other fields effectively complement human vision and in some respects replace it well. Mixed reality, one of the important applications of computer vision, uses special devices to blend computer-generated virtual information seamlessly with the real environment, giving people additional information such as explanatory text, video tutorials, and 3D animations.
The augmented reality studied in this thesis is one form of mixed reality. It uses computer vision techniques to analyze the objects and environmental features in a real scene and renders computer-generated auxiliary information at specified positions, helping people understand the scene better. A typical augmented reality system comprises modules for video input, feature analysis, camera localization, and virtual-real fusion, of which feature analysis and camera localization are the core. Offline augmented reality is already widely used in the film industry and in video advertising, whereas real-time augmented reality remains largely experimental. This thesis studies 3D tracking for real-time augmented reality, that is, recovering the relative spatial pose between the camera and the scene in real time. It covers the design of a multi-threaded framework, image feature analysis, keyframe-based representation of large-scale scenes, and bilayer segmentation under a purely rotating camera. Overall, the thesis aims to promote the use of real-time 3D tracking in augmented reality. Its main contributions are as follows.
A unified framework for real-time augmented reality systems is proposed. With the key techniques fully modularized and the module interfaces standardized, augmented reality under a variety of real-world environments is unified in a single multi-threaded parallel framework. Users can readily develop new augmented reality applications on top of it, and the framework makes full use of the computing power of multi-core machines, so the system remains efficient and reliable while adapting to complex environments.
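The abstract does not describe the framework's actual interfaces, so the following is only a minimal sketch of the general idea: each module runs in its own thread and talks to its neighbours through a queue, so modules can be swapped without touching the rest of the pipeline. The names `stage`, `run_pipeline`, and the worker functions passed in are hypothetical, not taken from the thesis.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker passed down the pipeline


def stage(worker, inbox, outbox):
    """Apply `worker` to every item from `inbox` and forward the result."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        outbox.put(worker(item))


def run_pipeline(frames, workers):
    """Run `workers` as a chain of threads; results keep the input order."""
    # One bounded queue between each pair of neighbouring stages.
    queues = [queue.Queue(maxsize=4) for _ in range(len(workers) + 1)]
    threads = [
        threading.Thread(target=stage, args=(w, queues[i], queues[i + 1]))
        for i, w in enumerate(workers)
    ]

    def feed():
        for frame in frames:
            queues[0].put(frame)
        queues[0].put(SENTINEL)

    threads.append(threading.Thread(target=feed))
    for t in threads:
        t.start()

    results = []
    while True:
        item = queues[-1].get()
        if item is SENTINEL:
            break
        results.append(item)
    for t in threads:
        t.join()
    return results
```

In a real system the workers would be something like feature extraction, pose estimation, and rendering. Note that in CPython, threads running pure-Python code do not execute on several cores at once, so a production framework would typically rely on native code or processes for the heavy stages.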
An improved fiducial marker system is proposed. In some desktop augmented reality applications, the system cannot extract enough features from the natural scene to localize the camera and must rely on fiducial markers. The marker proposed in this thesis is an image of a Chinese character enclosed in a black square frame. To detect markers stably under complex lighting and shadows, the system uses edge information to detect the marker's enclosing frame. The structure of each character is represented as a distance field from the character's contour to the frame, which makes the character markers more recognizable. In addition, because traditional fiducial markers are usually black-and-white patterns that are visually unattractive, the thesis also uses natural images as a supplement to the fiducial markers.
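The abstract gives no formula for the contour-to-frame distance field, so the sketch below shows just one possible reading of it. For every row and column of a binarized, already rectified marker interior, it records how far the frame is from the first ink pixel, and it compares markers by the L1 distance between these profiles. The function names and the tiny 3x3 "glyphs" are hypothetical illustrations, not the thesis's descriptor.

```python
def first_ink(cells):
    """Distance from the start of `cells` to the first ink pixel (1)."""
    for d, value in enumerate(cells):
        if value:
            return d
    return len(cells)  # no ink on this scanline


def border_distance_profile(glyph):
    """Concatenate left, right, top, and bottom frame-to-contour distances."""
    height, width = len(glyph), len(glyph[0])
    columns = [[glyph[r][c] for r in range(height)] for c in range(width)]
    left = [first_ink(row) for row in glyph]
    right = [first_ink(row[::-1]) for row in glyph]
    top = [first_ink(col) for col in columns]
    bottom = [first_ink(col[::-1]) for col in columns]
    return left + right + top + bottom


def identify(glyph, templates):
    """Return the name of the template whose profile is closest in L1 distance."""
    probe = border_distance_profile(glyph)

    def cost(name):
        ref = border_distance_profile(templates[name])
        return sum(abs(a - b) for a, b in zip(probe, ref))

    return min(templates, key=cost)


# Toy templates standing in for character images.
TEMPLATES = {
    "T": [[1, 1, 1], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}
```

Calling `identify` on a slightly corrupted "T" still returns `"T"` here, because a single flipped pixel changes only a few entries of the profile.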
A keyframe-based scene representation and a fast method for selecting candidate keyframes are proposed. For large-scale natural scenes, the system uses Structure-from-Motion to recover a sparse 3D point cloud of the scene from a preprocessing video sequence. Because a large-scale scene contains so many features, feature matching degrades markedly in both speed and the number of matches. Using a greedy optimization, the thesis automatically selects keyframes from the input preprocessing video sequence so that they contain relatively stable, distinctive 3D feature points. During 3D tracking, an image recognition algorithm selects candidate keyframes similar to the input image, and features are matched only against those candidates. To obtain more stable tracking, the thesis also uses the epipolar constraint and frame-to-frame tracking to match more feature points.
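The abstract does not state the objective that the greedy optimization minimizes. As an illustration only, the sketch below treats keyframe selection as a greedy set-cover problem: it repeatedly picks the frame that sees the most 3D points not yet covered, until a chosen fraction of the point cloud is covered. The function `select_keyframes`, the `visible` mapping, and the `coverage` parameter are assumptions made for this example.

```python
def select_keyframes(visible, coverage=0.95):
    """Greedily pick frames until `coverage` of all 3D points is seen.

    visible: dict mapping frame id -> set of 3D point ids seen in that frame.
    Returns the chosen frame ids in the order they were picked.
    """
    all_points = set().union(*visible.values())
    target = coverage * len(all_points)
    covered, chosen = set(), []
    while len(covered) < target:
        # Frame contributing the most points that are still uncovered.
        best = max(visible, key=lambda frame: len(visible[frame] - covered))
        gain = visible[best] - covered
        if not gain:
            break  # nothing left to gain; avoid looping forever
        chosen.append(best)
        covered |= gain
    return chosen


# Example: four frames observing seven reconstructed points.
observations = {0: {1, 2, 3}, 1: {3, 4}, 2: {4, 5, 6, 7}, 3: {1, 2}}
keyframes = select_keyframes(observations, coverage=1.0)
```

A fuller version would also weight points by stability and distinctiveness, which the abstract mentions but does not quantify.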
A fast and stable method for separating foreground from background under pure camera rotation is proposed. Layered scene segmentation is an important way to handle occlusion between virtual and real content in augmented reality, but the combined complexity of the scene and the camera motion makes it very hard to solve. The thesis addresses occlusion between foreground and background when the camera only rotates, which is in effect a foreground/background segmentation problem. The system first builds a panorama of the background, then registers each live input image to the panorama, estimates the background information, and segments the image with a graph-cut algorithm. To cope with complex backgrounds and background registration errors, the thesis combines over-segmentation to build local color models on the background panorama and suppresses the background's color contrast. The system obtains accurate segmentation results and realizes a series of special augmented reality effects.
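Registering a live image to a background panorama is feasible under pure rotation because two views from a camera that only rotates, with fixed intrinsics K, are related by the homography H = K R K^-1. The abstract does not say how the thesis performs this registration, so the sketch below only illustrates that standard relation for a rotation about the vertical axis. The helper names, the sign convention of the rotation, and the intrinsics used in the example are assumptions.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_homography(f, cx, cy, yaw):
    """H = K R K^-1 for focal length f, principal point (cx, cy), yaw in radians."""
    K = [[f, 0.0, cx], [0.0, f, cy], [0.0, 0.0, 1.0]]
    K_inv = [[1.0 / f, 0.0, -cx / f], [0.0, 1.0 / f, -cy / f], [0.0, 0.0, 1.0]]
    c, s = math.cos(yaw), math.sin(yaw)
    R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]  # rotation about the y axis
    return matmul(matmul(K, R), K_inv)


def warp(H, x, y):
    """Map pixel (x, y) through homography H and dehomogenize."""
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


# Example: where the image centre lands after a 10 degree pan.
H = rotation_homography(f=500.0, cx=320.0, cy=240.0, yaw=math.radians(10))
u, v = warp(H, 320.0, 240.0)
```

With this convention the principal point shifts horizontally by f·tan(yaw) and keeps its vertical coordinate. Once a pixel is mapped into the panorama, its color can be compared with the stored background to drive a segmentation step such as graph cut.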
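[Degree-granting institution]: Zhejiang University
[Degree level]: Doctoral
[Year degree conferred]: 2010
[Classification number]: TP391.41
[Citing literature]
Related master's theses (1 item)
1 Chen Shuo (陈烁); Research on Real-Time Tracking Technology in Augmented Reality [D]; Shenyang University of Technology; 2014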
Article ID: 2084215
Article link: https://www.wllwen.com/wenyilunwen/guanggaoshejilunwen/2084215.html