Research on Visual Attention Algorithms Based on Feature Fusion
Topic: visual attention + feature fusion; Source: PhD dissertation, China University of Mining and Technology (Beijing), 2017
【Abstract】: Visual attention is one of the central research topics in computer vision: it uses analytical methods such as pattern recognition and machine learning to predict the target or direction a subject is attending to. Visual attention algorithms based on feature fusion build a head feature matrix through feature extraction and fusion, compute head pose or gaze direction from it, and finally determine the attended target or direction. In recent years such algorithms have been widely applied in public security, natural meetings, driver assistance, and many other fields. Although visual attention based on facial features has been studied extensively, many problems remain, chiefly in three areas.

(1) Imbalanced expression of local and global features. Fusion methods typically weight and combine several features extracted from the whole image, which considers only the global effectiveness of fusion and leaves local features under-expressed; alternatively, they consider only local features, extracting them with many different methods and making the global representation overly complex. The salient features of different regions of the same image differ: extracting many feature types globally drives up computational complexity, while extracting too few leaves local information insufficiently expressed. To capture local features as fully and efficiently as possible while keeping global computation tractable, local and global feature expression must be balanced jointly.

(2) Complex head pose representation and inefficient computation. Head pose is a core component of visual attention, and accurate head pose estimation effectively supports the prediction and tracking of attended targets. Estimation methods fall into three categories: appearance-model-based, geometric-model-based, and feature-based. Feature-based methods are easily disturbed by external factors such as head accessories and changes in head position; appearance-model-based methods require large amounts of training data with accurately annotated poses; geometric-model-based methods run in real time but are strictly limited by camera calibration parameters and image resolution, and a single camera cannot capture depth, so even at pixel-level accuracy a pose angle error of roughly 5° remains. Accurate and efficient head pose computation therefore calls for a compact head pose feature matrix and an efficient computation method.

(3) Ambiguity between head pose and gaze direction. Gaze direction and head pose are the two core subjects of visual attention research; they complement each other, and neither alone can accurately express a person's attentional state. Multiple potential targets lie within the same head orientation range, so gaze direction is needed to pinpoint the attended target; conversely, with head orientation fixed, gaze shifts occur, meaning the attended target is changing. Existing work usually treats head pose analysis and gaze estimation as independent problems and does not resolve this ambiguity, so an algorithm that jointly models head pose and gaze direction is urgently needed.

Because of this local/global imbalance, the complexity and inefficiency of head pose representation and computation, and the head-orientation/gaze ambiguity, feature-fusion-based visual attention remains a difficult and challenging topic. This thesis addresses the three problems as follows.

(1) Balancing local and global feature expression. To express local features sufficiently, reduce global complexity, and balance the two, the thesis builds a local feature extraction framework based on information entropy and proposes a head feature matrix for pose estimation that fuses Gabor and Phase Congruency features by weighted entropy. First, information entropy measures the importance of each local image region and determines which features best express that region's original information; then all local features are concatenated compactly into a global feature matrix; finally, validation on public face and head datasets with machine learning classifiers and regressors shows that the proposed matrix, combined with suitable supervised learning, outperforms common global fusion matrices in head pose classification.

(2) Accurate representation and efficient computation of head pose. To improve accuracy and timeliness, the thesis proposes a head pose estimation algorithm based on depth information reconstruction, together with an improved weighted version. First, LBP (Local Binary Pattern) features of the head are extracted to build an Adaboost-LBP face classifier; then depth is reconstructed from the camera imaging model, and head pose is computed from the reconstructed depth and the geometric relation between the target and the camera. To improve depth accuracy, a 68-point face contour model is extracted with ASM (Active Shape Model) to build a weighted depth reconstruction algorithm. Experiments on head pose in visual attention scenes, combining the optimized depth with head features and an appearance model, show that the proposed algorithm and its weighted version outperform common methods in both representation accuracy and computational performance.

(3) Resolving the head pose/gaze ambiguity. A single head pose can correspond to several gaze directions, and the same gaze direction can occur under different head poses, so describing visual attention by either alone is ambiguous. To alleviate this, the thesis proposes a gaze-assisted visual attention algorithm based on an HMM (Hidden Markov Model). First, a deep convolutional neural network learns head data and computes head pose and gaze direction; then the HMM combines gaze direction with head pose to predict the attended direction or target; finally, experiments on public head pose datasets and live video show that the proposed algorithm weakens the ambiguity to some extent and improves target prediction accuracy.

Validation on homogeneous and heterogeneous data from public datasets and video leads to the following conclusions. (1) Fusing Gabor and Phase Congruency features in the weighted-information-entropy framework yields a head pose feature matrix that expresses local features fully, reduces global complexity, balances the two, and improves the classification accuracy and speed of head pose estimation. (2) The proposed depth-reconstruction-based head pose estimation algorithm and its weighted version reconstruct depth accurately and improve both the accuracy of head pose representation and the speed of estimation. (3) The proposed gaze-assisted algorithm, combining gaze direction with head pose through the HMM, alleviates the head-orientation/gaze ambiguity in visual attention and reduces attention error.
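The information-entropy-weighted fusion framework described in the abstract can be sketched roughly as follows. This is a minimal illustration only, assuming precomputed per-pixel Gabor and Phase Congruency response maps and a fixed block grid; the grid size, histogram bins, and the use of per-block mean responses are hypothetical stand-ins for the thesis's actual feature matrix construction.

```python
import numpy as np

def block_entropy(block, bins=16):
    """Shannon entropy (bits) of a block's intensity histogram."""
    hist, _ = np.histogram(block, bins=bins, range=(0.0, 1.0))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def entropy_weighted_fusion(img, feat_a, feat_b, grid=(4, 4)):
    """Fuse two per-pixel feature maps (e.g. Gabor magnitude and phase
    congruency) block by block: each block's features are weighted by the
    normalized entropy of the underlying image region, then all block
    descriptors are concatenated into one compact global vector."""
    h, w = img.shape
    bh, bw = h // grid[0], w // grid[1]
    descriptors = []
    for i in range(grid[0]):
        for j in range(grid[1]):
            sl = np.s_[i * bh:(i + 1) * bh, j * bw:(j + 1) * bw]
            # normalize entropy to [0, 1] by its maximum log2(bins)
            wgt = block_entropy(img[sl]) / np.log2(16)
            # mean responses stand in for the per-block features
            descriptors.append([wgt * feat_a[sl].mean(),
                                wgt * feat_b[sl].mean()])
    return np.concatenate(descriptors)
```

High-entropy regions (eyes, mouth, hairline) thus dominate the fused descriptor, while flat regions contribute little, which is one way to read the local/global balance argument above.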
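The geometric idea behind reconstructing depth from a single calibrated camera can be illustrated with similar triangles. The focal length, the assumed interpupillary distance, and the yaw heuristic below are all hypothetical illustrations; the thesis's weighted reconstruction over a 68-point ASM contour is considerably more elaborate.

```python
import math

# Assumed constants for illustration; a real system calibrates these.
FOCAL_PX = 800.0      # focal length in pixels (hypothetical)
EYE_DIST_MM = 63.0    # average adult interpupillary distance

def depth_from_eyes(eye_l, eye_r):
    """Pinhole-model depth via similar triangles: Z = f * W / w, where W
    is the real-world eye separation and w its projection in pixels."""
    w_px = math.dist(eye_l, eye_r)
    return FOCAL_PX * EYE_DIST_MM / w_px

def yaw_from_landmarks(eye_l, eye_r, nose):
    """Coarse yaw (degrees) from the nose tip's horizontal offset relative
    to the eye midpoint, scaled by half the eye span -- a toy geometric
    stand-in for a full landmark-based pose solver."""
    mid_x = 0.5 * (eye_l[0] + eye_r[0])
    half_span = 0.5 * abs(eye_r[0] - eye_l[0])
    ratio = max(-1.0, min(1.0, (nose[0] - mid_x) / half_span))
    return math.degrees(math.asin(ratio))
```

With eyes 63 px apart and the constants above, `depth_from_eyes` returns 800 mm; a centered nose gives zero yaw. Weighting several such landmark pairs, as the abstract describes, reduces the sensitivity of any single measurement.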
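Combining discretized head-pose and gaze observations through an HMM, as the gaze-assisted algorithm does, can be sketched with a standard Viterbi decoder over hidden attention targets. The state/observation encoding and all probabilities below are toy assumptions, not the thesis's trained model.

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Most likely sequence of hidden attention targets given discretized
    (head-pose, gaze) observation symbols.
    pi: (S,) initial probs; A: (S, S) transitions; B: (S, O) emissions."""
    S, T = len(pi), len(obs)
    logd = np.log(pi) + np.log(B[:, obs[0]])
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        scores = logd[:, None] + np.log(A)   # scores[i, j]: prev i -> next j
        back[t] = scores.argmax(axis=0)
        logd = scores.max(axis=0) + np.log(B[:, obs[t]])
    path = [int(logd.argmax())]
    for t in range(T - 1, 0, -1):            # backtrack best predecessors
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: two attention targets, three joint (pose, gaze) symbols.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])       # attention is sticky over time
B = np.array([[0.8, 0.1, 0.1], [0.1, 0.1, 0.8]])
path = viterbi([0, 0, 2, 2], pi, A, B)       # → [0, 0, 1, 1]
```

The sticky transition matrix is what lets gaze evidence disambiguate a fixed head orientation: a sustained change in the gaze symbol is needed before the decoded target switches, suppressing spurious flips.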
【Degree-granting institution】: China University of Mining and Technology (Beijing)
【Degree level】: Doctoral
【Year conferred】: 2017
【CLC number】: TP391.41
Article ID: 1941275
Link: https://www.wllwen.com/kejilunwen/ruanjiangongchenglunwen/1941275.html