Convolutional Neural Networks for Image Content Retrieval

Published: 2018-06-17 08:47

  Topics: convolutional neural network + image classification; Source: Hangzhou Dianzi University, master's thesis, 2017


[Abstract]: Image classification and retrieval have long been classic problems in image processing. With the rapid development of the mobile internet, the volume of image data has grown explosively, and classifying massive image collections has become a research hotspot. Traditional image classification methods rely on features designed by hand for specific kinds of images; they are not very robust and require extensive prior knowledge. Convolutional neural networks (CNNs) have brought a major breakthrough in this field: they automatically learn the essential features of the original images from large image collections and use them for classification, achieving better recognition rates and practicality than traditional methods. A CNN mimics the human visual system by dividing feature extraction into multiple levels from low to high, using network depth to obtain highly abstract features. It takes the image directly as network input and uses local receptive fields, weight sharing, and subsampling to reduce the number of network parameters, which avoids the overfitting caused by an excessive number of weights and gives the network a degree of invariance to translation, rotation, and distortion. CNNs are now widely used in image retrieval, where their recognition rate and practicality exceed those of traditional classification methods, so studying the application of CNNs to image content retrieval is of great significance. This thesis studies two aspects, practical application and network improvement. The main work is as follows:

(1) To address the question of how to choose parameters when designing a CNN model, comparative experiments vary the number and size of the convolution kernels, the arrangement of the sampling layers, and the activation function. The experiments find that the CNN performs better on the MNIST and CIFAR-10 datasets when the number of kernels is increased, the kernel size is reduced, the ReLU activation function is used, and the first sampling layer uses max sampling.

(2) For classifying an antiques image dataset, a data preprocessing method is proposed for images of differing sizes, which solves the problem of objects being deformed when images are converted to a uniform format. A method is also proposed that separates the object from the background before feeding the image into the CNN; experiments on the antiques dataset verify that the CNN used in this method has a simpler structure and a higher recognition rate than a CNN fed the images directly. Further experiments verify that the CNN retains excellent classification performance when images contain multiple objects. To handle the imbalance in the number of samples per class across the whole antiques dataset, a method combining CNN with HOG+SVM is proposed, and experiments show that it achieves a higher recognition rate than classification with the CNN alone.

(3) Because the sampling methods commonly used in CNNs each have advantages and disadvantages, a network model that performs max sampling and mean sampling separately in the sampling layer (the parallel sampling model) is proposed; experiments verify that it generalizes better than a traditional CNN. In addition, a method of pre-training the CNN is proposed so that noisy samples can be removed during network training, which solves the problem that directly training the network fails to converge when the training samples contain noise.
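The parallel sampling idea in item (3) can be illustrated with a minimal pure-Python sketch. It applies max sampling and mean sampling (pooling) to the same feature map and returns both results. The function names `pool2d`, `mean`, and `parallel_pool` are hypothetical, the windows are assumed to be non-overlapping, and returning the two branches as separate outputs is an assumption: the abstract does not state how the thesis merges them.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply non-overlapping size x size pooling to a 2-D list of numbers."""
    height, width = len(feature_map), len(feature_map[0])
    if height % size or width % size:
        raise ValueError("height and width must be divisible by size")
    pooled = []
    for top in range(0, height, size):
        row = []
        for left in range(0, width, size):
            # Collect every value inside the current window.
            window = [feature_map[top + dy][left + dx]
                      for dy in range(size) for dx in range(size)]
            row.append(reduce_fn(window))
        pooled.append(row)
    return pooled


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def parallel_pool(feature_map, size=2):
    """Run max and mean pooling side by side on the same input.

    Assumption: the two branches are returned as separate outputs;
    how the thesis combines them is not stated in the abstract.
    """
    return pool2d(feature_map, size, max), pool2d(feature_map, size, mean)


feature_map = [[0, 1, 2, 3],
               [4, 5, 6, 7],
               [8, 9, 10, 11],
               [12, 13, 14, 15]]
max_branch, mean_branch = parallel_pool(feature_map)
print(max_branch)   # [[5, 7], [13, 15]]
print(mean_branch)  # [[2.5, 4.5], [10.5, 12.5]]
```

The max branch keeps the strongest response in each window, while the mean branch keeps the average response, so a later layer that sees both has access to information that either sampling method alone would discard.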
[Degree-granting institution]: Hangzhou Dianzi University
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41; TP183





