Research on a Visual Attention Model Based on Hierarchical Features

Published: 2018-06-22 08:52

Topic: visual attention + top-down; Source: Huazhong University of Science and Technology, 2016 master's thesis

[Abstract]: People looking at an image involuntarily attend to some regions while ignoring others. This selectivity in visual perception results from the visual attention mechanism. In computer vision, modeling this mechanism lets a computer automatically locate the regions of human visual interest in complex environments. By simulating the human visual perception system, researchers have proposed a computational framework for visual attention based on feature integration, from which a variety of visual attention models have been derived. This thesis analyzes in detail two influential saliency-based visual attention models, Itti and Judd. The Itti model integrates multiple low-level features to generate a saliency map that predicts how much attention each image region receives; it ignores the influence of knowledge, task, preference, and similar factors on visual attention. The Judd model introduces knowledge by integrating high-level semantic features. Although it achieves good results, its heuristic features are complex to design and compute and do not extend well. Building on existing visual attention models, this thesis focuses on two questions: (1) how to obtain visual attention features through learning, and (2) how to integrate hierarchical features within the feature integration framework. First, pixel-level, object-level, and semantic-level features are obtained by training convolutional neural networks. Then, based on the learned features, a visual attention model that integrates hierarchical features is proposed; its emphasis is on using object attribute information for high-level feature integration, which effectively compensates for the weakness of existing models in introducing prior knowledge. Finally, a method of attention focus shifting guided by hierarchical knowledge is designed for the proposed model. Experiments show that the new model makes full use of prior knowledge and achieves good results on multiple datasets.
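The feature integration framework mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's model: it only shows the generic idea of normalizing several feature maps, combining them into one saliency map by a weighted sum, and then shifting the attention focus with winner-take-all selection plus inhibition of return. The function names (`normalize`, `integrate`, `scan_path`), the min-max normalization, and the fixed weights are all assumptions made for illustration; the thesis learns its features with convolutional neural networks and does not specify these details here.

```python
from typing import List, Tuple

Map = List[List[float]]  # a 2-D feature or saliency map (hypothetical type alias)


def normalize(m: Map) -> Map:
    """Min-max scale a map to [0, 1]; a constant map becomes all zeros."""
    flat = [v for row in m for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in m]
    return [[(v - lo) / (hi - lo) for v in row] for row in m]


def integrate(feature_maps: List[Map], weights: List[float]) -> Map:
    """Weighted average of normalized feature maps, giving a saliency map.

    Assumes all maps share one shape and the weights sum to a positive value.
    """
    if len(feature_maps) != len(weights):
        raise ValueError("need exactly one weight per feature map")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    normed = [normalize(m) for m in feature_maps]
    rows, cols = len(normed[0]), len(normed[0][0])
    return [
        [sum(w * m[r][c] for w, m in zip(weights, normed)) / total
         for c in range(cols)]
        for r in range(rows)
    ]


def scan_path(saliency: Map, n_fixations: int, radius: int = 1) -> List[Tuple[int, int]]:
    """Pick the most salient location, suppress its neighbourhood, repeat."""
    s = [row[:] for row in saliency]  # work on a copy
    path: List[Tuple[int, int]] = []
    for _ in range(n_fixations):
        cells = ((i, j) for i in range(len(s)) for j in range(len(s[0])))
        r, c = max(cells, key=lambda p: s[p[0]][p[1]])
        path.append((r, c))
        # Inhibition of return: the attended region cannot win again.
        for i in range(max(0, r - radius), min(len(s), r + radius + 1)):
            for j in range(max(0, c - radius), min(len(s[0]), c + radius + 1)):
                s[i][j] = float("-inf")
    return path


if __name__ == "__main__":
    low_level = [[0, 0, 0], [0, 4, 0], [0, 0, 0]]   # e.g. a contrast map
    high_level = [[0, 0, 2], [0, 0, 0], [0, 0, 0]]  # e.g. an object-level map
    saliency = integrate([low_level, high_level], weights=[1.0, 3.0])
    print(scan_path(saliency, n_fixations=2, radius=0))
```

Giving the high-level map a larger weight makes the first fixation land on the location it highlights, which mirrors, in a very simplified way, how prior knowledge can steer attention away from purely low-level contrast.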
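[Degree-granting institution]: Huazhong University of Science and Technology
[Degree level]: Master's
[Year degree granted]: 2016
[Classification number]: TP391.41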
