Applications of Greedy Algorithms in Sparse Learning
Published: 2018-01-28 11:43
Keywords: sparse learning; learning rate; greedy algorithm; conjugate gradient pursuit; partial complex vector gradient operator. Source: Hubei University, 2016 doctoral dissertation. Document type: degree thesis.
【Abstract】: Sparse representation is the most representative class of linear representation methods and is widely used in signal processing, image processing, machine learning, computer vision, and related fields. Viewed from different angles, these methods can be grouped into different types. For example, according to the norm used in the sparsity constraint, they fall into five categories: 1) l0-norm minimization; 2) lp-norm minimization (0 < p < 1); 3) l1-norm minimization; 4) l2,1-norm minimization; 5) l2-norm minimization. According to how the algorithms are carried out, they can also be divided into: 1) greedy strategy approximation; 2) constrained optimization; 3) optimization strategies based on proximal algorithms; 4) sparse representation based on homotopy algorithms. Sparse representation under the l0 norm yields sparse solutions, but directly solving the corresponding optimization problem is NP-hard; greedy approximation strategies were proposed precisely to address this difficulty. Compared with other regularization algorithms, greedy algorithms are computationally simple, easy to implement, and enjoy a large computational advantage. A greedy algorithm approximates the global optimum by searching for a local optimum in each iteration; the price paid for this computational convenience is uncertainty about the relationship between the algorithm's estimator and the estimation target. Yet even though greedy algorithms do not apply to every problem, they can still produce globally optimal solutions for many special problems, and proving the effectiveness of the algorithms in such cases is meaningful work. Based on the design ideas of greedy algorithms, this thesis proposes different greedy-type algorithms for two classes of learning problems and analyzes their statistical performance.
(1) Analysis of greedy algorithms for fixed-design Gaussian regression. For the fixed-design Gaussian regression setting, this thesis proposes a penalized empirical relaxed greedy algorithm and analyzes its effectiveness by establishing oracle inequalities. The main steps of the algorithm are: first, output a sequence of estimators via the empirical relaxed greedy algorithm; then, by penalizing the number of iterations and the capacity of the dictionary, select the optimal estimator. P. Massart established oracle-type inequalities for the lasso algorithm and, at the same time, obtained a fairly broad class of model selection lemmas. This thesis slightly generalizes that model selection lemma and applies it in three settings, namely finite dictionaries, countably infinite dictionaries, and uncountably infinite dictionaries, obtaining oracle-type inequalities that characterize the learning performance of the algorithm. The results show that the error bound is affected by the approximation error, the penalty term, and the noise level. In particular, when the target function belongs to the convex hull of the dictionary, the convergence rate of the algorithm is O((?)).
(2) Analysis of greedy algorithms for sparsity-constrained optimization. This thesis considers sparsity-constrained optimization problems whose objective is a general loss function (not the squared loss), proposes the sparse conjugate gradient method, the conjugate gradient pursuit method, and their variants, and analyzes their effectiveness. Yuan and Lu (2009) improved the traditional PRP formula and proposed a new conjugate direction; on that basis they established the global convergence of the conjugate gradient method for unconstrained optimization, obtained its convergence rate, and reported good experimental results. Inspired by this, for a broad class of sparsity-constrained optimization problems, this thesis develops three greedy-type algorithms that learn along conjugate gradient directions; the theoretical analysis shows that all of them attain a linear convergence rate when the objective function satisfies the restricted strong convexity and restricted smoothness conditions.
(3) Analysis of greedy algorithms for complex-valued sparse signal reconstruction. This thesis considers the recovery of complex-valued sparse signals. Because the squared loss function defined on complex variables is nowhere complex-differentiable except at the origin, algorithms that learn along gradient directions cannot be applied to this problem directly. D. H. Brandwood (1983) proposed the partial complex vector gradient operator, an analogue of the gradient operator, and observed that the partial complex vector gradient direction of the objective function corresponds to the direction of greatest change of the function. By learning along this partial complex gradient direction, this thesis proposes the sparse complex gradient algorithm, the complex gradient pursuit algorithm, and their variants. The analysis shows that when the observation matrix satisfies the D-RIP condition, the squared loss function possesses properties very similar to the restricted strong convexity and restricted strong smoothness conditions; based on these properties, we obtain a linear learning rate for the algorithms.
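As a concrete illustration of the greedy strategy described above, the following sketch implements a generic greedy dictionary approximation in the orthogonal matching pursuit style: each iteration selects the atom most correlated with the current residual and refits a least-squares estimate on the selected support. This is only an assumed, simplified example of the general greedy idea, not the penalized empirical relaxed greedy algorithm proposed in the thesis; the dictionary D, the target y, and the sparsity level k are hypothetical inputs.

# A minimal sketch of a greedy (orthogonal matching pursuit style) approximation
# over a finite dictionary. Generic illustration only, not the thesis's algorithm.
import numpy as np

def greedy_omp(D, y, k):
    """Greedily select up to k dictionary columns to approximate y."""
    n, m = D.shape
    support = []                      # indices chosen so far
    residual = y.copy()
    coef = np.zeros(m)
    for _ in range(k):
        # pick the atom most correlated with the current residual (local optimum)
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit a least-squares estimate on the selected atoms
        sub = D[:, support]
        c, *_ = np.linalg.lstsq(sub, y, rcond=None)
        coef[:] = 0.0
        coef[support] = c
        residual = y - sub @ c
    return coef, support

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.standard_normal((100, 50))
    D /= np.linalg.norm(D, axis=0)    # normalize the dictionary atoms
    x_true = np.zeros(50); x_true[[3, 17, 42]] = [1.5, -2.0, 0.7]
    y = D @ x_true + 0.01 * rng.standard_normal(100)
    coef, support = greedy_omp(D, y, k=3)
    print(sorted(support))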
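For part (2), the sketch below shows a simplified gradient-based relative of the proposed sparse conjugate gradient and conjugate gradient pursuit methods: a hard-thresholded descent step for a sparsity-constrained problem min f(x) subject to ||x||_0 <= s. It uses a plain gradient direction rather than the PRP-type conjugate direction analyzed in the thesis, and the least-squares loss, step size, and data below are illustrative assumptions.

# A minimal sketch of hard-thresholded gradient descent for a sparsity-constrained
# problem. Simplified relative of conjugate-gradient-based pursuit, for illustration.
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]
    out[idx] = x[idx]
    return out

def sparse_gradient_descent(grad, x0, s, step=0.3, iters=300):
    """Projected (hard-thresholded) gradient iteration for a smooth loss."""
    x = hard_threshold(x0, s)
    for _ in range(iters):
        x = hard_threshold(x - step * grad(x), s)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((80, 40)) / np.sqrt(80)
    x_true = np.zeros(40); x_true[[5, 12, 30]] = [1.0, -0.8, 0.5]
    b = A @ x_true
    grad = lambda x: A.T @ (A @ x - b)       # gradient of 0.5 * ||Ax - b||^2
    x_hat = sparse_gradient_descent(grad, np.zeros(40), s=3)
    print(np.round(x_hat[[5, 12, 30]], 3))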
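For part (3), the following sketch illustrates the partial complex (Wirtinger-type) gradient direction for the complex-valued squared loss f(z) = ||Az - b||^2: although this loss is not complex-differentiable, its gradient with respect to the conjugate variable is A^H(Az - b), and descending along it while keeping only the s largest-magnitude entries gives a simple thresholded iteration. This illustrates the gradient direction only, not the thesis's sparse complex gradient or complex gradient pursuit algorithms; the matrix A, vector b, step size, and thresholding rule are hypothetical.

# A minimal sketch of descent along the partial complex (Wirtinger-type) gradient
# A^H (A z - b) for the complex squared loss, with a sparsity-enforcing threshold.
import numpy as np

def complex_sparse_descent(A, b, s, step=0.5, iters=300):
    """Thresholded descent along the partial complex vector gradient."""
    m, n = A.shape
    z = np.zeros(n, dtype=complex)
    for _ in range(iters):
        g = A.conj().T @ (A @ z - b)          # gradient with respect to conj(z)
        z = z - step * g
        # keep only the s largest-magnitude entries (sparsity constraint)
        keep = np.argsort(np.abs(z))[-s:]
        mask = np.zeros(n, dtype=bool); mask[keep] = True
        z[~mask] = 0
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    A = (rng.standard_normal((60, 30)) + 1j * rng.standard_normal((60, 30))) / np.sqrt(120)
    z_true = np.zeros(30, dtype=complex); z_true[[2, 9]] = [1 + 1j, -0.5j]
    b = A @ z_true
    z_hat = complex_sparse_descent(A, b, s=2)
    print(np.round(z_hat[[2, 9]], 3))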
【Degree-granting institution】: Hubei University
【Degree level】: Doctoral
【Year awarded】: 2016
【CLC number】: TP301.6