A Cascaded Fully Convolutional Face Detection Algorithm Based on Multiple Supervision Signals

Published: 2018-06-05 09:13

Topic: face detection + deep learning; Reference: Harbin Institute of Technology, 2017 master's thesis

【Abstract】: Face detection is currently one of the key research directions in computer science and has considerable value in both research and commercial settings. In research, detection is a fundamental problem in computer vision, and face detection, as the basis for face alignment, face recognition, and face verification, has been studied extensively. In commercial applications, the goal of industry is to detect and recognize faces automatically, quickly, and accurately in complex natural scenes such as security, finance, and checkpoint surveillance. However, because real scenes are complex and face detection is sensitive to variations in face pose, angle, position, background, and many other factors, many difficulties in face detection remain unsolved. The face detection task can be decomposed into a binary classification subtask (face versus non-face) and a face search strategy subtask. The binary classification subtask decides whether a given region is a face, while the search strategy subtask looks for faces across the whole image. The simpler the binary classification model and the fewer regions searched, the faster the face detection algorithm runs. Current face detection algorithms face a trade-off: accurate algorithms use models too complex to run in real time, while algorithms that do run in real time use relatively simple models and are therefore less accurate. How to train a face detector that is both fast and accurate is a question of great practical interest. Mainstream methods treat the two subtasks independently: the classification model is a deep convolutional network, and the face search uses strategies such as sliding windows or selective search, which leads to excessive consumption of computing resources and long detection times.
This thesis proposes a face detection algorithm that has a simple model, detects accurately, and can detect faces at multiple scales simultaneously. Starting from the fully convolutional network, it combines the two subtasks effectively and trains a face detection model that places no restriction on face scale. Compared with an ordinary convolutional neural network, a fully convolutional network preserves spatial location information and is therefore better suited to tasks such as detection and segmentation. At test time, the image does not need to be rescaled to multiple scales and tested at each one; instead, most faces are detected in a single pass. Because the face scale is unconstrained, directly training a binary face classification model makes the model hard to converge or leaves it performing poorly, producing many false detections and missed detections. To reduce these problems and make the model easier to train, this thesis introduces the cascade idea into the fully convolutional network and proposes multiple supervision signals for both binary face classification and face bounding-box regression to help the network converge, yielding a fully convolutional face detection algorithm based on multiple supervision signals. Experiments on face detection benchmark sets verify the effectiveness of each part of the proposed algorithm, and the results show that the algorithm achieves good performance on these benchmarks.
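The abstract's central claim, that a fully convolutional network merges the classification and search subtasks, can be illustrated with a toy example. The sketch below is not the thesis's network: it assumes a hypothetical linear face/non-face scorer (`weights`, `bias`) on a tiny integer image, and all function names are invented for illustration. It shows only that applying the scorer convolutionally yields, in one pass, the same score at every location that cropping and classifying each window separately would give.

```python
def sliding_window_scores(image, weights, bias):
    """Search by cropping every k x k window and scoring it separately."""
    k = len(weights)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    scores = {}
    for r in range(rows):
        for c in range(cols):
            window = [row[c:c + k] for row in image[r:r + k]]
            scores[(r, c)] = bias + sum(
                w * x
                for w_row, x_row in zip(weights, window)
                for w, x in zip(w_row, x_row)
            )
    return scores


def fully_convolutional_scores(image, weights, bias):
    """Apply the same scorer as a 'valid' cross-correlation over the image.

    The whole score map is accumulated one filter tap at a time, so no
    window is ever cropped or classified on its own.
    """
    k = len(weights)
    rows, cols = len(image) - k + 1, len(image[0]) - k + 1
    score_map = [[bias] * cols for _ in range(rows)]
    for dr in range(k):
        for dc in range(k):
            w = weights[dr][dc]
            for r in range(rows):
                for c in range(cols):
                    score_map[r][c] += w * image[r + dr][c + dc]
    return score_map


# Toy 4 x 5 "image" with one bright 2 x 2 blob standing in for a face.
image = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0],
]
weights = [[1, 1], [1, 1]]  # hypothetical 2 x 2 scorer, not a trained model
bias = -3

windowed = sliding_window_scores(image, weights, bias)
score_map = fully_convolutional_scores(image, weights, bias)

# Top-left corners of windows whose score is positive (assumed threshold 0).
detections = [
    (r, c)
    for r, row in enumerate(score_map)
    for c, s in enumerate(row)
    if s > 0
]
print(detections)
```

In a real fully convolutional detector the scorer is a stack of convolutional layers, so intermediate features are shared between overlapping windows, which is where the savings over per-window classification come from.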
【Degree-granting institution】: Harbin Institute of Technology
【Degree level】: Master's
【Year degree granted】: 2017
【Classification number】: TP391.41

【Similar literature】