Research on Deep-Learning-Based Sign Language Recognition Methods

Published: 2018-03-03 17:35

Topic: sign language recognition　Focus: deep learning　Source: Jilin University, 2017 master's thesis　Document type: degree thesis

[Abstract]: Sign language is an important communication tool for deaf people and is widely used among them, and research on its complex and varied gestures can also advance gesture-based human-computer interaction. However, precisely because sign language is complex and varied and is used in complex environments, research on sign language recognition has long been difficult. Traditional approaches often require signers to wear expensive data gloves that capture signing information, or colored gloves, so that operations such as gesture feature extraction become easier. Such methods can reach high accuracy under restricted conditions, but they generalize poorly: features often have to be re-extracted by hand whenever the sign language dataset changes. This thesis introduces a family of deep learning methods into sign language recognition. For static sign language recognition, it proposes two models based on deep convolutional neural networks: static sign language recognition model 1 (SLR-CNN1) and static sign language recognition model 2 (SLR-CNN2). SLR-CNN1 is used to verify that deep convolutional neural networks are feasible for sign language recognition. SLR-CNN2 further improves static recognition accuracy; it introduces global average pooling into the recognition model, which greatly reduces the number of parameters and helps prevent overfitting. Extensive experiments verify that deep convolutional neural networks can automatically learn useful sign language features and can capture subtle variations in signs, and can therefore recognize sign language effectively. The thesis also uses the Caffe deep learning framework to train two sign language recognition models suitable for practical deployment.

For dynamic sign language recognition, the thesis combines deep convolutional neural networks with long short-term memory (LSTM) recurrent neural networks and proposes dynamic sign language recognition model 1 (SLR-LSRCN1) and dynamic sign language recognition model 2 (SLR-LSRCN2). The Caffe source code is modified so that the framework accepts consecutive video frames as model input. Extensive experiments show that combining convolutional and recurrent neural networks recognizes dynamic sign language effectively. On this basis, a dynamic sign language recognition model suitable for practical deployment is trained. Finally, to verify the feasibility of deep learning algorithms for sign language recognition, the thesis combines existing databases with a self-recorded one to label a large sample library for static sign language recognition, which makes algorithm validation and experimentation more convenient. By bringing deep learning methods to the sign language recognition task, the thesis offers a new approach to sign language recognition that is highly extensible and robust.
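The parameter reduction attributed to global average pooling in the abstract can be illustrated with a small arithmetic sketch. The abstract does not give the actual SLR-CNN2 layer sizes, so every dimension below (256 feature maps of 6 x 6, a 4096-unit hidden layer, 26 classes) is a hypothetical value chosen only for illustration, and the helper names are not from the thesis. The sketch also assumes the pooled vector feeds a single linear classifier, which is one common arrangement and not necessarily the one the author used.

```python
# Illustrative only: all layer sizes are hypothetical, not taken from the thesis.

def fc_head_params(channels, height, width, hidden, num_classes):
    """Parameters of: flatten -> dense(hidden) -> dense(num_classes)."""
    flat = channels * height * width
    first = flat * hidden + hidden                # weights + biases
    second = hidden * num_classes + num_classes   # weights + biases
    return first + second


def gap_head_params(channels, num_classes):
    """Parameters of: global average pooling -> dense(num_classes).

    The pooling step itself has no learnable parameters; it only reduces
    each channel's height x width map to its mean value.
    """
    return channels * num_classes + num_classes


def global_average_pool(feature_maps):
    """Average each channel's 2-D map to one number (plain-Python version)."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


if __name__ == "__main__":
    fc = fc_head_params(channels=256, height=6, width=6, hidden=4096, num_classes=26)
    gap = gap_head_params(channels=256, num_classes=26)
    print(f"fully connected head: {fc:,} parameters")
    print(f"global average pooling head: {gap:,} parameters")
    print(global_average_pool([[[1, 2], [3, 4]]]))
```

Under these assumed sizes the fully connected head needs tens of millions of parameters while the pooled head needs only a few thousand, which is the kind of reduction the abstract credits with limiting overfitting.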
[Degree-granting institution]: Jilin University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TP391.41

[References]