Image Classification Based on Set Representation

Published: 2018-07-29 11:43

【Abstract】: The essential problem in image classification is recognizing the objects or targets in an image, which requires an accurate description of the image's visual information. Local features are robust to background detail, illumination, and other external conditions, and have therefore become the mainstream form of feature representation, especially since the appearance of the scale-invariant feature transform (SIFT) and its many improved variants. However, different images usually yield different numbers of local features, so subsequent operations such as classification and retrieval cannot be applied directly to the local features. A unified set representation over each image's collection of local features is therefore needed. A set representation applies some operation to all local feature points extracted from an image and produces a single vector that represents the image. The main work and contributions of this thesis are as follows. First, from the perspective of set representations of images, the thesis describes in detail three set representation methods, namely the bag-of-words model, efficient match kernels, and the vector of locally aggregated descriptors (VLAD), and runs extensive experiments on the selected databases to verify the classification performance of the three methods. Second, it examines how well different clustering algorithms and different numbers of cluster centers suit the recently proposed VLAD representation. Three clustering algorithms, K-means, affinity propagation, and the Gaussian mixture model, are chosen because they differ in how the number of cluster centers is determined and in how local features are assigned to centers. Finally, the thesis proposes its own improvements to VLAD, based on a comprehensive study of the role and effectiveness of two operations, normalization and pooling. Normalization uses the power law and the L2 norm; pooling uses sum pooling, average pooling, and generalized max pooling. PPMI, Caltech-101, and Scene-15 are databases of actions, objects, and scenes, respectively, and the effectiveness of the above methods is verified on these three databases.
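As a rough illustration of what a set representation does, the sketch below aggregates a variable number of local descriptors into one fixed-length vector in the style of the local aggregation descriptor (commonly known as VLAD), using sum pooling followed by power-law and L2 normalization. It is a minimal sketch and not the thesis's implementation: the function name `vlad`, the exponent `alpha=0.5`, the hard nearest-center assignment, and the toy data are all assumptions, and the cluster centers are taken as given rather than learned with K-means, affinity propagation, or a Gaussian mixture model.

```python
import math

def vlad(descriptors, centers, alpha=0.5):
    """Aggregate local descriptors into one fixed-length vector (VLAD-style sketch).

    descriptors: list of d-dimensional lists, any length (one per local feature)
    centers:     list of k d-dimensional cluster centers (assumed precomputed)
    alpha:       power-law exponent (0.5 is an assumed, commonly used value)
    """
    k, d = len(centers), len(centers[0])
    residuals = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Hard-assign the descriptor to its nearest center (squared Euclidean distance).
        nearest = min(
            range(k),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(x, centers[j])),
        )
        # Sum pooling: accumulate the residual (descriptor minus center).
        for i in range(d):
            residuals[nearest][i] += x[i] - centers[nearest][i]
    # Concatenate the k residual sums into one vector of length k * d.
    v = [r for row in residuals for r in row]
    # Power-law normalization: sign(v) * |v|^alpha.
    v = [math.copysign(abs(r) ** alpha, r) for r in v]
    # L2 normalization (skipped if the vector is all zeros).
    norm = math.sqrt(sum(r * r for r in v))
    return [r / norm for r in v] if norm > 0 else v

# Two toy "images" with different numbers of local descriptors (made-up data).
centers = [[0.0, 0.0], [10.0, 10.0]]
image_a = [[1.0, 0.0], [0.0, 1.0], [9.0, 10.0]]
image_b = [[1.0, 1.0], [2.0, 0.0], [10.0, 11.0], [11.0, 10.0], [9.0, 9.0]]
print(len(vlad(image_a, centers)), len(vlad(image_b, centers)))  # both 4
```

Both toy images map to vectors of the same length (number of centers times descriptor dimension), which is what makes a standard classifier applicable afterwards. Replacing the accumulation step with a mean would give average pooling; generalized max pooling requires solving a small linear system per image and is not shown here.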
【Degree-granting institution】: Harbin Engineering University
【Degree level】: Master's
【Year degree granted】: 2016
【Classification number】: TP391.41

【References】