Research on Dictionary Learning Algorithms for Massive Time-Series Remote Sensing Images

Published: 2018-06-05 03:00

Topics: sparse representation + dictionary learning; source: Beijing University of Technology, 2014 master's thesis

[Abstract]: Sparse representation of images has become a research hotspot in image processing in recent years. Work in the area centers on the design of dictionary learning algorithms, fast and effective sparse representation algorithms, and applications of the theory to image processing. Transforming image information into a sparse domain can greatly simplify subsequent image analysis and processing in many respects, so it is of considerable theoretical significance for the field. Supported by the Chinese Academy of Sciences "One-Three-Five" planning project "Spatial Big Data and Data-Intensive Computing", this thesis exploits the high redundancy of massive remote sensing data along the time dimension to study dictionary learning for massive, similar, highly redundant data, so that such data can be represented more sparsely and efficiently, thereby advancing research on image sparse representation. First, the thesis analyzes why traditional dictionary learning algorithms do not scale to massive data and, combining the idea of incremental learning with the K-SVD dictionary learning algorithm, proposes an incremental K-SVD algorithm that trains on samples in batches. This overcomes the traditional requirement that all samples be gathered and trained together. The algorithm treats each image as a small sample set and, for each newly added sample, selectively trains atoms and adds them to the atom library (the dictionary). As samples keep arriving, the atoms in the dictionary become richer, so the dictionary represents the current new samples effectively without degrading the representation of earlier samples, which makes dictionary learning over massive sample sets feasible. Second, an information-entropy-based method for selecting initial atom values is proposed. Dictionary learning requires initial values for the atoms; the thesis computes the entropy of each column of sparse coefficients to gauge how the coefficients are distributed. Signals whose coefficient columns have high entropy are structures that are hard to represent sparsely, and these signals are used as the initial atom values, so that the trained atoms better match the structure of the current samples and enrich the atom library. Third, coherence reduction is applied during dictionary training. After studying several dictionary decoherence methods, the thesis proposes a dynamic dictionary decoherence model that introduces a coherence parameter as a decision condition in the learning process. For each dynamically added group of atoms, the model evaluates its effect on dictionary coherence. When a group of atoms pushes the dictionary coherence above the threshold, the algorithm first applies an iterative projection method so that the dictionary satisfies the coherence parameter, and then iteratively rotates that group of atoms under the sparse approximation residual objective without changing the dictionary's coherence. The atoms thus move closer to the training samples while the coherence bound is maintained. Finally, a large set of time-series Landsat remote sensing satellite data is used as experimental samples, and the proposed algorithm is compared with two other dictionary learning algorithms that can be trained on dynamic data sets. The experimental results show that the proposed algorithm represents the original data more effectively and more sparsely.
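The entropy-based atom initialization and the coherence check described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact definitions, so the following are assumptions: the entropy is taken as the Shannon entropy of each coefficient column after normalizing its absolute values to sum to 1, and coherence is taken as the standard mutual coherence (largest absolute inner product between distinct unit-norm atoms). The function names `column_entropy`, `select_initial_atoms`, and `mutual_coherence` are hypothetical and are not taken from the thesis.

```python
import numpy as np


def column_entropy(X, eps=1e-12):
    """Shannon entropy of each column of |X| (assumed definition).

    X has shape (n_atoms, n_signals); column j holds the sparse
    coefficients of signal j. Each column is normalized to sum to 1.
    """
    A = np.abs(X)
    totals = A.sum(axis=0, keepdims=True)
    P = A / np.maximum(totals, eps)
    # Treat 0 * log(0) as 0 by replacing zeros with 1 inside the log.
    logP = np.log(np.where(P > 0, P, 1.0))
    return -(P * logP).sum(axis=0)


def select_initial_atoms(Y, X, k):
    """Use the k signals with the highest coefficient entropy as initial atoms.

    Y has shape (signal_dim, n_signals). Returns unit-norm atoms and
    the indices of the chosen signals.
    """
    H = column_entropy(X)
    idx = np.argsort(H)[::-1][:k]  # highest entropy first
    atoms = Y[:, idx].astype(float)
    norms = np.linalg.norm(atoms, axis=0, keepdims=True)
    return atoms / np.maximum(norms, 1e-12), idx


def mutual_coherence(D):
    """Largest absolute inner product between distinct normalized atoms."""
    Dn = D / np.maximum(np.linalg.norm(D, axis=0, keepdims=True), 1e-12)
    G = np.abs(Dn.T @ Dn)
    np.fill_diagonal(G, 0.0)  # ignore each atom's product with itself
    return float(G.max())
```

In an incremental setting, one plausible use is to append the selected atoms to the current dictionary and compare `mutual_coherence` of the augmented dictionary against a chosen threshold before deciding whether a decoherence step is needed. The iterative projection and rotation steps themselves are not shown here, since the abstract does not specify them.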
[Degree-granting institution]: Beijing University of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TP751

[References]