Classification-Guided 3D Joint Head Pose Estimation and Facial Landmark Localization

Published: 2019-01-01 18:23

[Abstract]: The Internet has ushered in the digital era, and using machine learning and deep learning to extract high-level knowledge from large volumes of data in order to support human-computer interaction has become a major research focus. The key to human-computer interaction lies first in recognizing the features of specific parts of the human body according to the interaction requirements, and biometrics emerged as a fast, user-friendly means of identification. Relatively mature biometric technologies include iris, fingerprint, speech, gait, and face recognition. The face is an important biometric feature: because it is easy to capture and non-invasive, it is more readily accepted by subjects, which has in turn driven the steady maturation of research in this field. Estimating head pose and locating facial landmarks such as the eye corners, nose tip, mouth, and chin are key problems in face analysis. Both problems can already be solved well on images, but most image-based methods are sensitive to illumination and cannot handle faces with large head rotations or occlusion well. As the manufacturing cost of 3D scanners keeps falling, the accuracy of scanned data steadily improves, and depth data itself carries rich geometric information, more and more researchers are applying depth information to face analysis. Head pose estimation and facial landmark localization are often studied as two independent problems. However, the head pose estimate provides useful spatial transformation information for landmark localization, and the structure of the facial landmarks in turn reflects the values of the head pose vector. How to combine and jointly optimize the two is therefore a core problem of this thesis.
This thesis proposes a classification-guided method for joint 3D head pose estimation and facial landmark localization. First, "classification-guided" means that the head pose space is divided into several classes and the facial landmark localization algorithm is run separately within each class. This ensures that, within the same pose class, the missing regions of the head point cloud are relatively consistent, which greatly helps the stability of landmark localization. Second, the thesis introduces a joint formulation: within a cascaded random forest regression framework, the head pose estimate is combined with a face template annotated with landmarks to provide a good initial value for the initialization stage of the cascade, and the landmark localization result at each stage of the cascade is in turn used to refine the head pose vector. Finally, the thesis presents a 3D face database that contains different identities, expressions, and head poses, with ground-truth values for both the head pose vector and the facial landmark shape vector. Extensive experiments demonstrate the effectiveness and efficiency of the method. On two commonly used 3D databases, BIWI and B3D(AC)2, it achieves more accurate results than existing methods. The method is also applicable to other tasks that involve pose estimation and landmark localization, and thus has a degree of generalization ability.
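The two ideas in the abstract that lend themselves to a small illustration are the partition of the head pose space into classes and the use of the pose estimate plus a landmark-annotated template to initialize the cascade. The Python sketch below is only an illustration under stated assumptions: the abstract does not say how the pose space is partitioned or which rotation convention is used, so the regular yaw/pitch grid, the 30-degree bin width, the Euler order, and all function names (`pose_class`, `rotation_matrix`, `init_landmarks`) are hypothetical. It does not implement the cascaded random forest regression itself.

```python
import math


def pose_class(yaw_deg, pitch_deg, bin_deg=30.0, limit_deg=90.0):
    """Map (yaw, pitch) to a class index on a regular grid.

    Assumption: the thesis only says the pose space is split into
    several classes; a uniform grid is one possible partition.
    """
    n = int(2 * limit_deg / bin_deg)  # number of bins per axis

    def bin_index(angle):
        clamped = max(-limit_deg, min(limit_deg, angle))
        # The upper boundary is folded into the last bin.
        return min(int((clamped + limit_deg) // bin_deg), n - 1)

    return bin_index(pitch_deg) * n + bin_index(yaw_deg)


def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(yaw_deg, pitch_deg, roll_deg):
    """R = Rz(roll) * Ry(yaw) * Rx(pitch); the order is an assumed convention."""
    y, p, r = (math.radians(v) for v in (yaw_deg, pitch_deg, roll_deg))
    rx = [[1.0, 0.0, 0.0],
          [0.0, math.cos(p), -math.sin(p)],
          [0.0, math.sin(p), math.cos(p)]]
    ry = [[math.cos(y), 0.0, math.sin(y)],
          [0.0, 1.0, 0.0],
          [-math.sin(y), 0.0, math.cos(y)]]
    rz = [[math.cos(r), -math.sin(r), 0.0],
          [math.sin(r), math.cos(r), 0.0],
          [0.0, 0.0, 1.0]]
    return matmul3(rz, matmul3(ry, rx))


def init_landmarks(template, pose_deg, head_center):
    """Rotate template landmarks by the estimated pose and move them to the head center.

    template: list of (x, y, z) landmarks on a frontal face template.
    pose_deg: (yaw, pitch, roll) in degrees.
    head_center: (x, y, z) translation applied after the rotation.
    """
    rot = rotation_matrix(*pose_deg)
    return [
        tuple(sum(rot[i][j] * pt[j] for j in range(3)) + head_center[i]
              for i in range(3))
        for pt in template
    ]


# Usage: a toy two-point "template" rotated by a 90-degree yaw.
template = [(0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
initial_shape = init_landmarks(template, (90.0, 0.0, 0.0), (0.0, 0.0, 0.0))
frontal_class = pose_class(0.0, 0.0)
```

In a full pipeline, the class index would select which per-class landmark regressor to run, and the rotated template would serve as the starting shape that the cascade then refines.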
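[Degree-granting institution]: University of Science and Technology of China
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.41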
