Research on Image Scene Classification Based on the LDA Topic Model

Published: 2018-07-02 10:39

Topic: LDA topic model + K-means++; Source: master's thesis, North University of China (中北大学), 2017

[Abstract]: Image classification is one of the research directions in computer vision and underpins other image applications. Image scene classification is an important approach to this problem. This thesis addresses several problems in existing image scene classification techniques based on the LDA topic model and proposes improved methods that raise both the classification accuracy and the execution efficiency of the LDA topic model. The Latent Dirichlet Allocation (LDA) topic model is a widely used image processing method: it abstracts the low-level local features of an image into visual words, builds a visual dictionary, and counts visual-word frequencies to construct a mid-level semantic representation of the image. A classifier then assigns scene labels automatically, which achieves automatic image classification. To address the problems of the traditional model in image scene recognition, this thesis carries out the following work:
1. Because the clustering method used by the traditional model is inefficient, the KMeans++ clustering algorithm is used to generate the visual words.
2. Traditional methods ignore word weights when representing an image, so the learned topic distributions are biased toward high-frequency words. To handle the power-law distribution of visual-word occurrences, a weighted statistical histogram is used to represent the image.
3. Because the main features of an image are not used effectively during scene recognition, a feature function is introduced to strengthen the role of important features in classification and recognition. The resulting model, the Featured Latent Dirichlet Allocation (FLDA) topic model, improves the efficiency of image scene classification and recognition.
4. The parameters of the LDA model are difficult to estimate directly. To address this, an improved variational inference method, Fast Variational Inference (FVI), is proposed. It reduces the number of parameter iterations and the computational cost, and improves the execution efficiency of the model.
Analysis of repeated experiments on different datasets shows that the proposed FLDA model and fast variational inference algorithm effectively improve the accuracy and execution efficiency of topic-model-based image scene classification, and that they generalize to some degree and are stable.
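The first two steps of the pipeline described in the abstract (generating visual words with KMeans++ and representing each image as a weighted histogram) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the thesis's implementation: the function names are hypothetical, the descriptors are toy 2-D points standing in for real local features, and the IDF-style weight is an assumed stand-in, since the abstract does not say which weighting the thesis uses.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda i: sq_dist(p, centers[i]))


def kmeanspp_init(points, k, rng):
    """k-means++ seeding: the first center is uniform at random; each later
    center is drawn with probability proportional to its squared distance
    from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        if sum(d2) == 0:  # every point already coincides with a center
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
    return centers


def kmeans(points, k, iters=20, seed=0):
    """Lloyd iterations started from k-means++ seeds; returns k centers,
    which play the role of the visual words."""
    rng = random.Random(seed)
    centers = kmeanspp_init(points, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centers)].append(p)
        # An empty cluster keeps its previous center.
        centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return centers


def weighted_histograms(images, centers):
    """Count visual words per image, then down-weight words that occur in
    many images. The smoothed IDF weight is an assumption for illustration."""
    k = len(centers)
    counts = []
    for descriptors in images:
        hist = [0] * k
        for d in descriptors:
            hist[nearest(d, centers)] += 1
        counts.append(hist)
    n = len(images)
    df = [sum(1 for h in counts if h[j] > 0) for j in range(k)]
    idf = [math.log((1 + n) / (1 + df[j])) + 1 for j in range(k)]
    return [[h[j] * idf[j] for j in range(k)] for h in counts]


# Toy data: three "images", each a list of 2-D local descriptors.
images = [
    [(0.0, 0.1), (0.1, 0.0), (5.0, 5.1)],
    [(0.2, 0.1), (0.0, 0.2)],
    [(0.1, 0.1), (9.0, 0.0), (9.1, 0.2)],
]
all_points = [d for img in images for d in img]
visual_words = kmeans(all_points, k=3)
histograms = weighted_histograms(images, visual_words)
```

In a full system the weighted histograms would then be passed to the topic model and classifier; those stages, including FLDA and FVI, are not reproduced here because the abstract does not give their details.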
[Degree-granting institution]: North University of China (中北大学)
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TP391.41
