Research on Key Computer Vision Technologies Based on Convolutional Neural Networks
Published: 2018-08-14 14:08
[Abstract]: In recent years, research on deep learning has attracted broad interest from academia and industry and has driven rapid progress in a range of applied research in artificial intelligence. Convolutional neural networks (CNNs), an important branch of deep learning, are especially closely tied to research on computer vision. With continued improvements in network architectures and the emergence of massive datasets, CNNs have recently achieved breakthroughs in a range of computer vision applications. Computer vision, however, is a very broad field, and considerable room for research remains both in the depth of specific techniques and in the breadth of the field. Research in computer vision can be divided into three levels: low-level features, mid-level semantic feature representation, and high-level semantic understanding. Building on CNNs, this thesis explores a representative key technology at each of the three levels: object recognition, scene labeling, and scene recognition. The three main lines of research and the contributions of each are as follows.

First, in object recognition, to address the tendency of single-column CNNs to overfit, the thesis studies how heterogeneous multi-column CNNs perform in object recognition. Experiments on mainstream datasets show that a heterogeneous multi-column CNN generalizes better than a single-column CNN. To address the limited variety of fusion schemes and the poor generalization of conventional network fusion, a network fusion strategy based on a sliding window is proposed. The sliding-window strategy selectively fuses the confidence information output by the different networks. Compared with a conventional single fusion scheme, it is a more general method that subsumes the classical fusion strategies and effectively improves fusion results.

Second, in scene labeling, a CNN-based scene labeling method is proposed that outperforms classical algorithms on mainstream indoor and outdoor scene labeling datasets. On the feature learning problem in scene labeling, the thesis studies how CNN features obtained through training and generic CNN features perform in the scene labeling task. To address the region-consistency problem in the output of conventional scene labeling algorithms, a region consistency incentive algorithm is proposed. It uses the global edge probability of the scene image to iteratively improve the region consistency of the labeling. Experiments on public datasets show that the algorithm achieves better labeling accuracy and visual consistency than classical algorithms of the same kind.

Third, in scene recognition, a method based on multi-scale salient-region feature learning is proposed, which achieves better recognition results on public datasets than comparable classical algorithms. Because the content of scene images is relatively complex, a strategy for identifying salient regions is proposed, and the multi-scale information of the salient regions is used to represent a scene image. Because conventional hand-crafted features discriminate poorly in scene recognition, a CNN transfer learning strategy is used to learn scene image features over the multi-scale salient regions and to build the feature representation. Experiments show that feature learning based on multi-scale salient regions effectively improves scene recognition accuracy. In addition, features obtained by CNN transfer learning are more discriminative than conventional hand-crafted features.

In summary, this thesis studies three key computer vision technologies on the basis of CNNs. For each specific problem it designs the CNN architecture and the way the network is applied, and it proposes effective solutions to domain-specific problems. Experiments on public datasets show that the proposed methods achieve better results in their respective areas than classical conventional methods.
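The abstract gives no details of the sliding-window fusion strategy, so the following is only an illustrative sketch of how a windowed fusion rule could generalize classical fusion of per-network confidences. It is an assumption, not the thesis's algorithm: the function names `window_fuse` and `predict`, the rank-based window, and the sample numbers are all hypothetical. For each class, the confidences from the different networks are sorted in descending order and the values inside a window of ranks are averaged.

```python
from typing import List, Sequence


def window_fuse(confidences: Sequence[Sequence[float]],
                start: int, width: int) -> List[float]:
    """Fuse per-class confidences from several networks (hypothetical rule).

    confidences[k][c] is network k's confidence for class c. For each class,
    the values from all networks are sorted in descending order and the
    entries at ranks start .. start + width - 1 are averaged.
    """
    num_nets = len(confidences)
    if num_nets == 0:
        raise ValueError("need at least one network output")
    if start < 0 or width < 1 or start + width > num_nets:
        raise ValueError("window must lie within the number of networks")
    num_classes = len(confidences[0])
    if any(len(row) != num_classes for row in confidences):
        raise ValueError("all networks must score the same classes")

    fused = []
    for c in range(num_classes):
        # Rank this class's confidences across networks, highest first.
        ranked = sorted((net[c] for net in confidences), reverse=True)
        window = ranked[start:start + width]
        fused.append(sum(window) / width)
    return fused


def predict(fused: Sequence[float]) -> int:
    """Return the index of the class with the highest fused confidence."""
    return max(range(len(fused)), key=lambda c: fused[c])


# Made-up confidences from three networks over three classes.
outputs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],
    [0.5, 0.4, 0.1],
]
average_fusion = window_fuse(outputs, start=0, width=3)  # window covers all ranks
max_fusion = window_fuse(outputs, start=0, width=1)      # window is the top rank
median_fusion = window_fuse(outputs, start=1, width=1)   # window is the middle rank
```

Under this reading, a window spanning every rank reduces to average fusion and a width-one window at the top rank reduces to max fusion, which is one way a windowed rule could be "compatible with the classical fusion strategies". On the made-up data above, average and max fusion both select class 1, while the middle-rank window selects class 0.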
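[Degree-granting institution]: University of Electronic Science and Technology of China (电子科技大学)
[Degree level]: Doctoral
[Year degree conferred]: 2017
[Classification number]: TP391.41; TP18
Article ID: 2183103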
Article link: https://www.wllwen.com/kejilunwen/zidonghuakongzhilunwen/2183103.html