Research on Deep-Learning-Based Facial Landmark Localization and Recognition Technology

Published: 2018-06-13 17:59

Topic: deep learning + facial landmark localization; source: Beijing University of Posts and Telecommunications, 2016 master's thesis

[Abstract]: Face recognition is an important biometric technology with broad application prospects in criminal investigation, finance, enterprise management, intelligent transportation, and many other fields. Deploying it in practice, however, still requires solving a number of problems, such as the accurate localization of facial landmarks and face recognition under pose variation, low resolution, and partial occlusion. In recent years, as research on deep learning has progressed, it has been widely applied to face recognition and has achieved breakthrough results on many public datasets. Because of its strong representational power and its ability to fit a wide range of nonlinear functions, deep learning is well suited to these problems. Against this background, this thesis analyzes and summarizes existing deep learning algorithms for facial landmark localization and for face recognition, and proposes a facial landmark localization algorithm based on a multi-task learning deep convolutional network and a face recognition algorithm based on a multimodal representation deep convolutional network. Both perform well on the experimental datasets.

For facial landmark localization, the popular methods of the past two years mostly use cascaded deep models as the basic framework, for example the CFCNN[67] and CFAN[68] networks. This cascade structure, however, lowers the training and prediction efficiency of the model and does not consider robustness to face pose variation. The multi-task deep convolutional neural network (DCNN) proposed in this thesis treats facial landmark localization as the main task and head pose detection as an auxiliary task, and learns the two jointly with one deep convolutional neural network, which makes landmark localization robust to head pose. On the AFLW[70] dataset it reaches landmark localization accuracy comparable to or higher than that of cascade structures, with a shorter prediction time.

For face recognition, DCNN-based models such as DeepID[72] and FaceNet[13] have achieved very good results in recent years, but they still use a single-modality representation of the face, which copes poorly with pose variation. The multimodal-representation deep convolutional network proposed in this thesis makes two main improvements. First, several parallel deep convolutional networks extract features from the global face image, local face regions, and the pose-recovered face image, yielding features that are invariant to pose, partial occlusion, and similar variations. Second, stacked auto-encoders (SAE) replace the traditional principal component analysis (PCA) method for feature dimensionality reduction, giving a more nonlinear feature transformation. Face verification accuracy and face identification accuracy, evaluated on the LFW[13] and CASIA-WebFace[80] datasets respectively, both exceed those of a conventional DCNN model.
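
The joint learning of landmark localization (main task) and head pose detection (auxiliary task) described in the abstract can be illustrated with a combined loss. The following is a minimal NumPy sketch and not the thesis's implementation: the abstract does not state the loss functions, so the squared-error landmark term, the softmax cross-entropy over discrete pose classes, the weight `aux_weight`, and all function names are assumptions made for illustration.

```python
import numpy as np


def landmark_loss(pred, target):
    """Squared error summed over coordinates, averaged over the batch.

    pred, target: arrays of shape (batch, 2 * n_points).
    """
    return float(np.mean(np.sum((pred - target) ** 2, axis=1)))


def pose_loss(logits, labels):
    """Softmax cross-entropy over discrete pose classes (assumed formulation).

    logits: (batch, n_poses); labels: integer class indices of shape (batch,).
    """
    # Subtract the row maximum for numerical stability before exponentiating.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-np.mean(log_probs[np.arange(len(labels)), labels]))


def multitask_loss(lm_pred, lm_target, pose_logits, pose_labels, aux_weight=0.5):
    """Main-task loss plus a weighted auxiliary-task loss.

    aux_weight is a hypothetical hyperparameter; the abstract gives no value.
    """
    main = landmark_loss(lm_pred, lm_target)
    aux = pose_loss(pose_logits, pose_labels)
    return main + aux_weight * aux


# Toy usage with random data standing in for network outputs.
rng = np.random.default_rng(0)
batch, n_points, n_poses = 4, 5, 3
lm_target = rng.uniform(0.0, 1.0, size=(batch, 2 * n_points))
lm_pred = lm_target + rng.normal(0.0, 0.05, size=lm_target.shape)
pose_logits = rng.normal(size=(batch, n_poses))
pose_labels = rng.integers(0, n_poses, size=batch)
print(multitask_loss(lm_pred, lm_target, pose_logits, pose_labels))
```

In a real network both terms would be computed from two output heads that share the convolutional layers, so gradients from the pose term also shape the features used for landmark regression; that sharing is what a multi-task setup of this kind relies on.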
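[Degree-granting institution]: Beijing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.41;TP18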

[References]