Research on Image Classification Algorithms Based on Convolutional Neural Networks

Published: 2018-07-10 12:56

Topics: convolutional neural network + image classification; Source: University of Jinan, 2017 master's thesis

[Abstract]: With economic development and rapid social progress, image data plays an increasingly important role in daily life. The explosive growth of image data means that more and more categories must be distinguished, and the content of the objects being classified is increasingly complex. Traditional image classification methods can no longer meet the needs of practical applications, so improving classification accuracy on large-scale data is of great significance. The convolutional neural network (CNN) is a new type of artificial neural network that performs well on two-dimensional images and is therefore widely used for image classification. Because classification accuracy depends on the structure of the network, studying how to optimize CNN structure has both theoretical and practical value. This thesis analyzes the basic concepts and algorithms of CNNs and, building on classical CNNs, makes the following two contributions.

(1) Based on the classical PCA network (PCANet) structure, a maxout network is introduced before the nonlinear activation function and the SVM classifier is replaced with a softmax classifier, yielding a maxout CNN with unsupervised PCA pre-training. Solving for the network parameters requires no tuning tricks, training time is short, the convolution kernels are obtained without repeated iteration, and the network adapts to different image classification tasks. The overall pipeline has five stages. Stage 1: filters are learned by unsupervised PCA pre-training and convolved with the image to extract features. Stage 2: the extracted features pass through the maxout network and then through the nonlinear activation function ReLU. Stage 3: the output of the activation function is binarized to produce new feature maps. Stage 4: block-wise histograms are computed over the new feature maps, vectorized into a column vector, and fed to the fully connected layer. Stage 5: a softmax classifier performs the classification. Experiments on the handwritten-digit MNIST dataset and its variants and on the natural-image CIFAR-10 dataset show that the maxout CNN with unsupervised PCA pre-training improves classification accuracy to some extent.

(2) Based on the classical Network in Network (NIN) structure, the pixels of the input image are reconstructed and a multi-path CNN based on bilateral filtering is built. This network reduces the loss of foreground-object texture and shape information during feature extraction from complex images. The network has two input paths: one takes the original image and the other takes the image obtained by reconstructing the pixels of the original image. The two paths extract features independently, and after the average pooling layer the feature vectors from both paths are merged and fed to a softmax classifier. On the natural-image CIFAR-100 dataset, image complexity and the learning curves of CNNs on images of different complexity are analyzed. The analysis concludes that images of high complexity are easily misclassified because foreground-object texture and shape information is lost in the feature vectors extracted by the convolutional and pooling layers. Experiments on the natural-image CIFAR-10 and CIFAR-100 datasets verify that the multi-path CNN based on bilateral filtering achieves higher classification accuracy than a traditional single-path CNN.
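Stages 2 to 4 of the first network (maxout, ReLU, binarization, block histograms) can be illustrated with a small sketch. This is not the thesis implementation, and the abstract does not give these details, so several choices here are assumptions: maxout is taken as an element-wise maximum over consecutive groups of `group_size` feature maps, the binarized maps are packed into one integer code map as in the original PCANet hashing step, and histograms are computed over non-overlapping square blocks. All function names are hypothetical. Plain Python lists are used only to keep the example self-contained. PCA filter learning (stage 1) and the softmax classifier (stage 5) are omitted.

```python
from typing import List

FeatureMap = List[List[float]]  # one 2-D feature map stored as rows of floats


def maxout(maps: List[FeatureMap], group_size: int) -> List[FeatureMap]:
    """Element-wise max over consecutive groups of `group_size` maps (assumed grouping)."""
    if len(maps) % group_size != 0:
        raise ValueError("number of maps must be divisible by group_size")
    out = []
    for g in range(0, len(maps), group_size):
        group = maps[g:g + group_size]
        rows, cols = len(group[0]), len(group[0][0])
        out.append([[max(m[r][c] for m in group) for c in range(cols)]
                    for r in range(rows)])
    return out


def relu(maps: List[FeatureMap]) -> List[FeatureMap]:
    """Clamp negative responses to zero."""
    return [[[max(0.0, v) for v in row] for row in m] for m in maps]


def binarize_and_hash(maps: List[FeatureMap]) -> List[List[int]]:
    """Threshold each map at zero and pack the resulting bits into one integer map.

    Map i contributes bit i, so n maps give codes in the range 0 .. 2**n - 1.
    (Bit packing follows the original PCANet design; assumed here.)
    """
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(1 << i for i, m in enumerate(maps) if m[r][c] > 0)
             for c in range(cols)] for r in range(rows)]


def block_histograms(code_map: List[List[int]], block: int, n_bins: int) -> List[int]:
    """Concatenate histograms of non-overlapping block x block tiles into one vector."""
    rows, cols = len(code_map), len(code_map[0])
    features: List[int] = []
    for r0 in range(0, rows - block + 1, block):
        for c0 in range(0, cols - block + 1, block):
            hist = [0] * n_bins
            for r in range(r0, r0 + block):
                for c in range(c0, c0 + block):
                    hist[code_map[r][c]] += 1
            features.extend(hist)
    return features


# Toy input: four 2x2 filter responses, as might come out of stage 1.
maps = [
    [[-1.0, 2.0], [0.5, -3.0]],
    [[-2.0, 1.0], [-0.5, -1.0]],
    [[3.0, -1.0], [-2.0, -4.0]],
    [[-1.0, -2.0], [1.0, -0.5]],
]
activated = relu(maxout(maps, group_size=2))    # stage 2: 4 maps -> 2 maps
code_map = binarize_and_hash(activated)         # stage 3: integer codes 0..3
features = block_histograms(code_map, block=2, n_bins=4)  # stage 4
print(code_map, features)
```

In this toy run the two maps left after maxout yield 2-bit codes, so each block histogram has four bins. The resulting `features` vector is what would be passed to the fully connected layer and softmax classifier.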
[Degree-granting institution]: University of Jinan
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TP391.41;TP183

[References]