Research on Projection Matrix and Denoising Techniques for Speech Compressed Sensing Based on Overcomplete Dictionaries

Published: 2018-11-08 07:31

[Abstract]: Over the past decade, compressed sensing (CS) has become an active research topic in signal processing. CS relaxes the high sampling rates required by conventional sampling schemes and so reduces wasted resources: a small number of measurements is enough for the receiver to reconstruct the original signal exactly or approximately. Speech signals are sparse, so combining CS with speech signal processing opens new possibilities for the field. This thesis starts from that premise. Because any practical speech CS system contains noise, it studies denoising mainly through the sparse representation and the measurement (projection) matrix of the CS system, with the aim of improving system robustness.

The research content and contributions are as follows. First, the thesis reviews the background of CS theory, surveys the state of research on its key techniques, and summarizes the application and development of speech CS together with the research group's earlier results. Second, it describes in detail the three core components of CS: the sparse basis, the measurement matrix, and the reconstruction algorithm. Third, it studies the characteristics of speech signals and confirms through a series of simulations that applying CS to speech signal processing is feasible. Finally, it examines how noisy speech performs in a CS system and how noise affects each part of the system.

Building on this groundwork, the thesis proposes an improved K-SVD dictionary learning method based on the fast iterative shrinkage-thresholding algorithm (FISTA). The method uses FISTA for the sparse coding stage of K-SVD and keeps the classical K-SVD rule for the dictionary update; alternating the two steps yields the learned dictionary. The trained dictionary is used to sparsify speech signals, which are then measured and reconstructed, so that the method is applied to the speech CS process. Results show that the proposed method trains the dictionary faster than classical K-SVD and achieves a lower root-mean-square error (RMSE). Its denoising ability is then examined under white noise with different dictionary parameters; the experiments show a higher output signal-to-noise ratio than classical K-SVD, indicating good denoising performance.

Last, the thesis proposes a joint design method for the optimal projection and the learned dictionary, to improve reconstruction and denoising in CS applications. Assuming that a closed-form expression exists for a given dictionary, the dictionary is decomposed by SVD and an expression for the projection matrix is derived, such that the product of the projection matrix and the dictionary is a Parseval tight frame. The optimal projection matrix can therefore be obtained from the dictionary. Simulations show that, compared with other methods, the proposed design gives better denoising performance on speech signals.
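
The two-step training loop described in the abstract (FISTA for sparse coding, the classical K-SVD rule for the dictionary update) can be sketched roughly as follows. This is a minimal illustration rather than the thesis's implementation: the l1-regularized objective, the parameter names `lam`, `n_iter` and `n_rounds`, the random initialization, and all function names are assumptions, since the abstract does not give them.

```python
import numpy as np

def soft_threshold(z, t):
    """Element-wise shrinkage operator used by ISTA/FISTA."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def fista_sparse_code(D, X, lam=0.1, n_iter=100):
    """Approximately solve min_A 0.5*||X - D A||_F^2 + lam*||A||_1 with FISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    A = np.zeros((D.shape[1], X.shape[1]))
    Y, t = A.copy(), 1.0
    for _ in range(n_iter):
        grad = D.T @ (D @ Y - X)           # gradient of the smooth term at Y
        A_next = soft_threshold(Y - grad / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        Y = A_next + ((t - 1.0) / t_next) * (A_next - A)   # momentum step
        A, t = A_next, t_next
    return A

def ksvd_dictionary_update(D, A, X):
    """Classical K-SVD update: refit each atom by a rank-1 SVD of its residual."""
    D, A = D.copy(), A.copy()
    for k in range(D.shape[1]):
        users = np.flatnonzero(A[k])       # signals that currently use atom k
        if users.size == 0:
            continue                       # unused atom: leave it unchanged
        A[k, users] = 0.0
        E = X[:, users] - D @ A[:, users]  # residual without atom k
        U, s, Vt = np.linalg.svd(E, full_matrices=False)
        D[:, k] = U[:, 0]
        A[k, users] = s[0] * Vt[0]
    return D, A

def fista_ksvd(X, n_atoms, lam=0.1, n_rounds=10, seed=0):
    """Alternate FISTA sparse coding and K-SVD atom updates (assumed setup)."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((X.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)         # unit-norm initial atoms
    for _ in range(n_rounds):
        A = fista_sparse_code(D, X, lam)
        D, A = ksvd_dictionary_update(D, A, X)
    return D, A
```

Here `X` would hold one speech frame per column. Classical K-SVD typically uses a greedy pursuit such as OMP in the coding stage; swapping in FISTA replaces the hard sparsity constraint with an l1 penalty, which is the design choice assumed in this sketch.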
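
The abstract states that the projection matrix is derived from the SVD of the dictionary so that their product is a Parseval tight frame, but it does not give the resulting expression. The sketch below shows one construction with that property, assuming a full-row-rank dictionary `D` of size n x K and m <= n measurements; the function name and the specific formula are illustrative assumptions, not the thesis's derived result.

```python
import numpy as np

def projection_from_dictionary(D, m):
    """Return an m x n matrix Phi such that Phi @ D has orthonormal rows.

    With D = U diag(s) Vt, the choice Phi = diag(1/s[:m]) U[:, :m].T gives
    Phi @ D = Vt[:m], so (Phi @ D) @ (Phi @ D).T is the identity and the
    columns of Phi @ D form a Parseval tight frame for R^m.
    """
    n, K = D.shape
    U, s, _ = np.linalg.svd(D, full_matrices=False)
    if m > n or s[m - 1] <= 1e-12 * s[0]:
        raise ValueError("need m <= n and m nonzero singular values")
    # Keep the m leading singular directions and rescale each by 1/s.
    return (U[:, :m] / s[:m]).T

# Usage with a random stand-in for a learned dictionary.
rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
Phi = projection_from_dictionary(D, 8)
G = Phi @ D                                # equivalent dictionary
print(np.allclose(G @ G.T, np.eye(8)))
```

Any left multiplication of `Phi` by an orthogonal matrix preserves the tight-frame property, so this construction is not unique.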
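[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TN912.3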
