Research on Representation Learning Based on Linear Reconstruction and Its Application in Image Analysis
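Topics: image analysis + face recognition; Source: Nanjing University of Aeronautics and Astronautics (南京航空航天大学), doctoral dissertation, 2016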
[Abstract]: With the arrival of the information age, computers have become an indispensable information processing tool in modern society. With advances in communication technology and the spread of the Internet, images are increasingly the information carrier that people encounter most in daily life. Compared with traditional text, images have clear advantages as an information carrier: intuitiveness (an image directly reflects the scene), comprehensiveness (an image reproduces a scene fully and in detail), universality (images are not limited by national borders or language), and convenience (the content of an image is easier to understand). Automatic image analysis and processing by computer has therefore become a foundation for the development of an intelligent society. Unlike text processing, automating image analysis and processing is considerably more challenging. First, people lack an understanding of how to use computers for image processing and analysis: humans have used and processed text for thousands of years and have accumulated rich experience, whereas digital images and computer science have at most a hundred years of history. How to apply and develop computer science and technology so as to capture the characteristics of images and extract the information they contain has always been the goal and driving force of computer vision. Second, people lack methodological guidance for using computers for image processing and analysis: at present, the automation of image analysis and processing mainly draws on the way the human image cognition system works, but that system is the product of millions of years of evolution and is highly complex, and the history of computer science is a drop in the ocean by comparison. How to grasp the essence of the human image cognition system and apply it to image processing will remain a long-standing problem for computer vision.
In terms of existing image representation methods, image feature representations are extracted in two main ways: feature descriptors, which extract the internal structure of an image by imitating the human organs of image cognition, and shallow and deep learning, which imitate the way the human nervous system processes images. Taking the design of a face recognition system, a basic application of image analysis, as its background, this thesis starts from shallow learning and uses linear reconstruction to improve face alignment and image representation in face recognition systems. Specifically, the contributions are as follows: (1) A new manifold learning method is proposed and applied to the face alignment problem. Current shallow learning methods usually assume that the sample space has a linear structure. This reduces the complexity of data processing but ignores the topological structure among the data. In fact, high-dimensional data usually has some manifold structure, the most obvious example being the space of face shape vectors. Nevertheless, in parametric models for face alignment, the shape model still assumes that the face shape space is linear. Manifold learning, as a nonlinear embedding method, can embed high-dimensional data into a manifold space through nonlinear dimensionality reduction and thereby obtain data with a linear structure, but it requires estimating the dimension of the data manifold, so its computational cost is high and it cannot meet real-time requirements. By smoothing local sub-manifolds, a new manifold learning method is proposed on the basis of local tangent space alignment. Because it has an explicit projection matrix and performs the manifold transformation in the original space, it combines well with the shape model used in face alignment methods, so that the manifold structure of face shapes is embedded into the model. (2) An improved spatial non-negative matrix factorization method is proposed.
Among representation learning methods based on linear reconstruction, non-negative matrix factorization is a feature learning method designed specifically for image data. Compared with earlier representation methods, the basis images of non-negative matrix factorization have better local structure, so, as a local (parts-based) representation learning method, it learns image representation vectors that are more robust and more interpretable. To further improve the locality of the basis images, current improvements to non-negative matrix factorization mainly focus on embedding the spatial information of the image into the basis images. However, this spatial information usually comes from the two-dimensional network (grid) structure of the image and therefore lacks any connection to the data content. To address this, drawing on the way factor analysis extracts relationships among image features, a spatial regularization method that combines the feature distribution of the data with spatial structure information is proposed and combined with a large-margin constraint. This not only embeds the spatial structure and fuses discriminability with locality, but also reduces the conflicting effects of the discriminative and locality constraints on the data representation. (3) A new attribute feature is proposed. Compared with features extracted by traditional feature descriptors, an attribute is a higher-level feature: it summarizes not some geometric structure contained in the image but the degree to which some semantic information is expressed in the image. Because of this, attribute features are easier for humans to interpret. However, semantics are defined in different ways, and many are relatively abstract concepts, so attribute learning is usually very complex and inaccurate.
Specifically, learning continuous attributes requires extracting appropriate features for each attribute separately and training a classifier for each attribute, with the classifier's output taken as the degree to which that attribute is expressed in a sample. Which attributes are chosen as sample features, which features best express each attribute, and how the attribute classifiers are designed all affect the quality of a sample's attributes to varying degrees. Therefore, based on prototype theory from psychology, a class-relative attribute, the prototype relative attribute, is proposed. Each attribute reflects the relatedness of a sample to one of the known classes, so there is no need to search an attribute pool for problem-relevant attributes. In addition, every attribute uses the same features to represent the sample, which simplifies the attribute learning process to some extent.
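As a point of reference for contribution (2), the sketch below shows plain non-negative matrix factorization as a linear reconstruction X ≈ WH with non-negative factors, using the standard multiplicative update rules. It is only the unregularized baseline: the spatial regularization and large-margin constraint described in the abstract are not specified there and are not reproduced here. The function names (`matmul`, `transpose`, `frobenius_error`, `nmf`) and the toy matrix are illustrative assumptions, not taken from the thesis.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(X, W, H):
    """Squared Frobenius reconstruction error ||X - WH||^2."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))


def nmf(X, rank, iters=200, seed=0, eps=1e-9):
    """Baseline NMF (assumption: no spatial or margin terms).

    X is (pixels x samples). Columns of W act as basis images and
    columns of H are the per-sample representation vectors.
    """
    rng = random.Random(seed)
    n, m = len(X), len(X[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T X) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, X)
        den = matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(X, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


if __name__ == "__main__":
    # Toy non-negative data: 4 "pixels", 3 "images".
    X = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
    W, H = nmf(X, rank=2)
    print("reconstruction error:", frobenius_error(X, W, H))
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, which is what gives the basis images their additive, parts-like character.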
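The idea behind contribution (3), one attribute per known class, all computed from the same feature vector, can be illustrated with a small sketch. The abstract does not state how the relatedness between a sample and a class is computed, so everything concrete here is an assumption: class means stand in for prototypes, and a softmax over negative squared distances stands in for the relatedness score. The names `class_prototypes` and `prototype_attributes` are hypothetical.

```python
import math


def class_prototypes(samples, labels):
    """Mean feature vector per class (assumed stand-in for a prototype)."""
    sums, counts = {}, {}
    for x, y in zip(samples, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}


def prototype_attributes(x, prototypes, temperature=1.0):
    """One attribute per known class: softmax of negative squared distance.

    This scoring rule is an illustrative assumption, not the thesis's method.
    """
    classes = sorted(prototypes)
    dists = [sum((a - b) ** 2 for a, b in zip(x, prototypes[c])) for c in classes]
    shift = min(dists)  # subtract the minimum for numerical stability
    weights = [math.exp(-(d - shift) / temperature) for d in dists]
    total = sum(weights)
    return {c: w / total for c, w in zip(classes, weights)}


if __name__ == "__main__":
    samples = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
    labels = ["a", "a", "b", "b"]
    protos = class_prototypes(samples, labels)
    print(prototype_attributes([0.0, 0.4], protos))
```

Every attribute is derived from the same feature vector `x`, so no per-attribute feature extraction or attribute-pool search is needed, which mirrors the simplification the abstract describes.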
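[Degree-granting institution]: Nanjing University of Aeronautics and Astronautics (南京航空航天大学)
[Degree level]: Doctoral
[Year degree awarded]: 2016
[Classification number]: TP391.41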
Article ID: 1884198
Article link: https://www.wllwen.com/shoufeilunwen/xxkjbs/1884198.html