Research and Application of Image Recognition Methods Based on Deep Learning

Published: 2019-05-09 14:59

[Abstract]: Image recognition is an important research direction in image analysis and an active topic in machine vision. In recent years, deep learning has produced many results in image, speech, and text processing; it occupies an important position in artificial intelligence and is widely applied in everyday life. Traditional image recognition methods require hand-designed features, depend on researchers with extensive image recognition experience, and achieve relatively low recognition rates. With the growth of the Internet and information technology, they can no longer cope with the massive volumes of image data produced in the big-data era. Deep learning, by contrast, uses a multi-layer network structure that, by simulating the human brain, learns and extracts features automatically and makes full use of large datasets. This thesis therefore combines deep learning with image recognition and studies how to improve the image recognition rate.
The thesis first reviews the theory of image recognition and deep learning. Compared with shallow learning, deep learning can represent complex functions more easily and generalizes well. The thesis also discusses several commonly used deep learning models and the principles of their algorithms, and studies image feature extraction and recognition methods. Building on the study of deep neural networks, it proposes an improved weight initialization method to address the slow network learning caused by the original initialization method. The effectiveness of the method is verified both theoretically and experimentally, and the method can also be applied to commonly used convolutional neural networks and deep belief networks.
Second, deep neural networks suffer from the vanishing gradient problem, while the semi-supervised nature of deep belief networks makes it possible to exploit large amounts of unlabeled data. The thesis therefore proposes an improved deep belief network learning model. Experiments show that both the learning speed and the recognition accuracy of the model improve. Relative to the unimproved deep belief network, the model reaches a recognition rate of 99.18% on the MNIST dataset, an improvement of 0.62%, and improves the recognition rate on the CIFAR-10 dataset by 9.6%.
Finally, because convolutional neural networks are particularly well suited to image-related problems, the thesis proposes an improved convolutional neural network model. The model first replaces the original initialization method with the improved weight initialization method. It then removes the pooling layers and replaces the original softmax layer with an SVM classifier. Lastly, it uses an improved activation function that combines the smoothness of the Sigmoid function with the sparsity and fast convergence of the ReLU function, and introduces Dropout to strengthen generalization and prevent overfitting. The model reaches a recognition rate of 99.52% on the MNIST dataset, 0.66% higher than the unimproved convolutional neural network and about 5% higher than traditional methods. On the CIFAR-10 dataset, recognition accuracy is 6.4% higher than the unimproved convolutional neural network and about 9% higher than traditional methods. The experiments confirm the effectiveness of the model and the improvement in image recognition rate.
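The abstract does not give the formula for the improved activation function, so the following is only a sketch of the properties it names, not the thesis's method. It contrasts the standard Sigmoid and ReLU functions and applies inverted Dropout to a list of activations. The helper `inverted_dropout`, its `rate` parameter, the fixed seed, and the sample inputs are all illustrative assumptions.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Standard Sigmoid: smooth everywhere, but never exactly zero."""
    # Branch on the sign of x so math.exp never receives a large positive value.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def relu(x: float) -> float:
    """Standard ReLU: exactly zero for negative inputs (sparse outputs)."""
    return max(0.0, x)


def inverted_dropout(values, rate, rng):
    """Zero each value with probability `rate` and rescale survivors.

    Hypothetical helper for illustration. Survivors are divided by
    (1 - rate) so the expected activation is unchanged during training.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


# Sample pre-activations (assumed values, not data from the thesis).
inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
sig_out = [sigmoid(x) for x in inputs]    # all strictly between 0 and 1
relu_out = [relu(x) for x in inputs]      # negatives clipped to exactly 0
dropped = inverted_dropout(relu_out, rate=0.5, rng=random.Random(0))
```

Sigmoid outputs are never exactly zero, while ReLU produces exact zeros for negative inputs; an activation of the kind the abstract describes would aim to keep the second property without giving up the smoothness of the first.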
[Degree-granting institution]: Central China Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.41

[References]