Research on Scene Classification Methods Based on Semantic Analysis

Published: 2017-12-28 03:16

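  Keywords: Research on Scene Classification Methods Based on Semantic Analysis  Source: Harbin Institute of Technology, 2017 doctoral dissertation  Document type: degree thesis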


  More related articles: scene classification, adaptive sample selection, cognitive model, knowledge representation and reasoning


[Abstract]: Scene understanding is one of the central challenges for computer vision theory and applications. It covers scene classification, image segmentation, object detection and annotation, among other techniques. Of these, scene classification is a prerequisite for scene understanding and plays an indispensable role in vision applications such as video surveillance and robot navigation and decision-making. Scene classification is therefore an important research topic in computer vision, machine learning and pattern recognition.

In recent years, the rapid development of computing technology and image sensors has broadened the ways images are captured and advanced the vision field. For example, the popular image-sharing site Flickr stores more than six billion images, and the well-known image-based social network Instagram has more than one hundred million active users. At the same time, more and more devices can capture images, which has driven a wave of smart-device adoption and extended the scenarios and scope in which devices are used. Abundant image data can give users better information resources, but its sheer volume makes manual classification increasingly unable to keep up with demand, and manual classification is at odds with the trend toward intelligent devices. Studying scene classification methods that label categories automatically is therefore a necessary route to more efficient image retrieval and broader intelligent vision applications.

Existing scene classification methods fall mainly into classification methods based on low-level visual features and inference methods based on knowledge semantics. These methods train visual classifiers on visual features and usually perform well on small sample sets. Their main shortcomings are twofold. First, there is a semantic gap between low-level visual features and the high-level semantics that humans understand, so the features do not describe images well. Second, knowledge-semantics methods emphasize semantic attributes when constructing the knowledge base and reasoning over it, and neglect the important role of visual attributes. To address the scene classification problem, this thesis proposes a complete theoretical framework comprising image sample selection, image description with a bag of visual words extended by semantic hierarchy, scene structure analysis, and construction of a visual-attribute knowledge base. The main contributions are as follows.

1. Starting from visual cognition, an automatic sample collection method is proposed. It addresses two problems of uncertainty-based active learning: the class distribution of samples is not considered, and the selected samples require additional annotation. A certainty evaluation based on the bag of visual words is introduced into the entropy-based uncertainty measure, so that active learning can collect samples effectively while labeling their classes automatically. In addition, the negatively accelerated learning theory from cognitive psychology is used to adjust the iteration stopping condition adaptively. During training, samples of different classes receive different weights according to a sample similarity measure, and the weights are updated over the iterations, which speeds up convergence. Experimental results show that the method improves sample collection efficiency and that classifiers trained on the collected samples achieve better classification performance.

2. A scene classification method based on semantic hierarchy extension is proposed. It addresses the problem that low-level visual features, because of the semantic gap, cannot effectively describe the high-level semantics of an image. Abstract semantics are introduced to extend the bag-of-words model across multiple levels, and a semantic preservation method is proposed that generates visual dictionaries at higher semantic levels on top of the primary visual dictionary built by the bag-of-words model. Semantics are propagated layer by layer in a bottom-up manner to train the upper-level semantic classifiers, which improves the descriptive power of the bag-of-words model. At classification time, the class of a test sample is determined layer by layer in a top-down manner. Experimental results show that the proposed method outperforms the other classification methods compared.

3. A hierarchical structure for indoor scenes is proposed. It addresses the problem that the decoration of indoor scenes varies widely and that classes resemble one another, both of which hinder classifier training: indoor scenes of different classes can look similar, while scenes of the same class can look different. Based on the regularities of human cognition and the characteristics of indoor scenes, the thesis proposes a scene hierarchy. A hierarchical detection method partitions the hierarchy automatically, and hierarchical semantics describe the structure of the indoor scene. Compared with existing classification methods, the proposed hierarchy describes indoor scenes better and thereby improves scene classification performance.

4. Building on indoor scene structure detection, a high-level knowledge base construction method is proposed for classifying indoor scenes. Indoor scene classification is a prerequisite for scene interaction, yet methods based on first-order logic ignore the ubiquitous hierarchical structure and visual attributes when constructing the knowledge base. To address this, a knowledge representation and inference method for indoor scenes based on Markov logic networks is proposed. It introduces the scene hierarchy described above, together with visual attributes, to build a high-level knowledge base with greater descriptive power. Experimental results show that the constructed knowledge base is robust and classifies indoor scenes effectively.

In summary, this thesis studies sample selection, image description with a semantically extended bag of visual words, scene structure analysis, and construction of a visual-attribute knowledge base. Together, the proposed methods form a scene classification framework and improve scene classification performance.
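The entropy-based uncertainty measure mentioned in the first contribution can be illustrated with a small sketch. The abstract gives no formulas, so the code below shows only the standard Shannon-entropy criterion from uncertainty sampling, not the thesis's method: the bag-of-visual-words certainty term, the class weighting and the adaptive stopping rule are all omitted. The function names, the `threshold` cutoff and the rule of auto-labeling low-entropy samples with their most probable class are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def split_by_uncertainty(predictions, threshold):
    """Partition unlabeled samples by predictive entropy.

    predictions: dict mapping sample id -> list of class probabilities.
    threshold:   hypothetical entropy cutoff (an assumption, not from the thesis).

    Returns (auto_labeled, uncertain):
      auto_labeled maps each low-entropy sample id to its most probable class index;
      uncertain lists the remaining sample ids, most uncertain first.
    """
    auto_labeled = {}
    uncertain = []
    for sample_id, probs in predictions.items():
        h = entropy(probs)
        if h <= threshold:
            # Confident prediction: accept the classifier's own label.
            auto_labeled[sample_id] = max(range(len(probs)), key=probs.__getitem__)
        else:
            uncertain.append((h, sample_id))
    uncertain.sort(reverse=True)  # highest entropy first
    return auto_labeled, [sample_id for _, sample_id in uncertain]


# Toy classifier outputs for three unlabeled images (made-up numbers).
predictions = {
    "img_a": [0.96, 0.02, 0.02],
    "img_b": [0.34, 0.33, 0.33],
    "img_c": [0.50, 0.45, 0.05],
}
auto_labeled, uncertain = split_by_uncertainty(predictions, threshold=0.5)
```

With these toy values, `img_a` falls below the cutoff and is labeled automatically, while `img_b` and `img_c` remain in the uncertain pool in that order. A pure entropy criterion like this ignores class distribution, which is the limitation the abstract says the thesis sets out to address.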
[Degree-granting institution]: Harbin Institute of Technology
[Degree level]: Doctoral
[Year degree granted]: 2017
[Classification number]: TP391.41


Article ID: 1344346


Article link: https://www.wllwen.com/shoufeilunwen/xxkjbs/1344346.html

