Research on the Application of Empirical Mode Decomposition in Single-Channel Blind Speech Separation
Published: 2019-01-14 08:22
[Abstract]: Blind source separation (BSS) is an active research topic in international signal processing, and speech separation was one of the earliest fields to adopt it. With the rapid development of mobile communication and Internet technology, speech separation has found application in many areas, such as speech recognition systems, computer audition, wireless communication, and video and telephone conferencing. Empirical mode decomposition (EMD) is a relatively new signal analysis method that overcomes the limitations of the frequency concept in the traditional Fourier transform and has strong advantages for nonlinear, non-stationary signals. Speech is a typical nonlinear, non-stationary signal, so EMD offers a new approach to speech signal processing and a new route to blind speech separation. This thesis studies single-channel blind speech separation based on EMD. The main work is as follows. To address the shortcomings of non-negative matrix factorization (NMF) in single-channel speech separation, to avoid the sparsity constraint that NMF places on the mixing matrix, and to reduce frequency-domain aliasing between the separated source signals, the mixed speech signal is first processed with ensemble empirical mode decomposition (EEMD). EEMD decomposes the signal into several intrinsic mode functions (IMFs) that carry its instantaneous characteristics, which simplifies its spectral structure. Exploiting the short-time stationarity of speech and taking into account the large data volume of speech signals, the resulting IMFs are sparsified for the NMF mathematical model, the scale-invariant Itakura-Saito (IS) divergence is chosen as the NMF cost, and the speech source signals are finally reconstructed with a clustering algorithm.
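As a rough illustration of the NMF step described above, the following is a minimal sketch of NMF under the Itakura-Saito divergence, d(x|y) = x/y - log(x/y) - 1, which is unchanged when x and y are scaled by the same factor. The abstract does not give the thesis's update rules, sparsification, or clustering step, so this sketch uses a common multiplicative-update variant (with a square-root step) on a synthetic non-negative matrix standing in for an IMF power spectrogram. The names `is_divergence` and `is_nmf` and all parameter values are assumptions made for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero and log(0)


def is_divergence(V, V_hat):
    """Itakura-Saito divergence between V and V_hat, summed over all entries."""
    ratio = (V + EPS) / (V_hat + EPS)
    return float(np.sum(ratio - np.log(ratio) - 1.0))


def is_nmf(V, rank, n_iter=200, seed=0):
    """Factor a positive matrix V (freq x frames) as W @ H under the IS divergence.

    Uses multiplicative updates with a square-root step. This is an assumed,
    generic variant, not necessarily the update rule used in the thesis.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 0.1   # spectral bases
    H = rng.random((rank, V.shape[1])) + 0.1   # time activations
    losses = [is_divergence(V, W @ H)]
    for _ in range(n_iter):
        V_hat = W @ H + EPS
        H *= np.sqrt((W.T @ (V / V_hat**2)) / (W.T @ (1.0 / V_hat)))
        V_hat = W @ H + EPS
        W *= np.sqrt(((V / V_hat**2) @ H.T) / ((1.0 / V_hat) @ H.T))
        losses.append(is_divergence(V, W @ H))
    return W, H, losses


# Synthetic rank-2 positive matrix (hypothetical stand-in for a spectrogram).
rng = np.random.default_rng(1)
V = (rng.random((20, 2)) + 0.1) @ (rng.random((2, 30)) + 0.1)
W, H, losses = is_nmf(V, rank=2)
print(f"IS divergence: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

In a pipeline like the one the abstract outlines, the columns of `W` and rows of `H` would then be grouped per speaker by a clustering step before reconstruction; that step is omitted here because the abstract does not specify it.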
Simulation experiments show that this EEMD-based NMF algorithm slightly improves separation quality. The theory of single-channel BSS is far less mature than that of conventional underdetermined or overdetermined BSS. To tackle the single-channel underdetermined problem, EEMD is used to convert the single-channel mixed speech into virtual multi-channel data (single input, multiple outputs), which is then processed with the relatively mature fast independent component analysis (FastICA) algorithm to reconstruct and recover the source signals. Because running ICA directly on the IMFs produced by EEMD takes too many iterations and converges slowly, principal component analysis (PCA) is first applied to the IMFs to reduce their dimensionality, and FastICA then performs the separation, which improves iteration efficiency. Simulation experiments verify the effectiveness of the algorithm.
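The PCA and FastICA stages of the second method can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the thesis's implementation: the EEMD stage is assumed to have been run elsewhere, and a synthetic six-channel linear mixture of two sources stands in for the matrix of IMFs. The function names `pca_whiten`, `sym_decorrelate`, and `fast_ica`, the tanh nonlinearity, and all parameter values are illustrative choices.

```python
import numpy as np


def pca_whiten(X, n_components):
    """Center X (channels x samples), keep the top principal components, whiten."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    vals, vecs = np.linalg.eigh(cov)                 # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]      # indices of the largest ones
    K = (vecs[:, idx] / np.sqrt(vals[idx])).T        # whitening matrix (k x channels)
    return K @ Xc


def sym_decorrelate(W):
    """Symmetric decorrelation: W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W


def fast_ica(Z, max_iter=200, tol=1e-6, seed=0):
    """Symmetric fixed-point ICA with a tanh nonlinearity on whitened data Z."""
    k, T = Z.shape
    rng = np.random.default_rng(seed)
    W = sym_decorrelate(rng.standard_normal((k, k)))
    for n_iter in range(1, max_iter + 1):
        G = np.tanh(W @ Z)
        W_new = (G @ Z.T) / T - np.diag((1.0 - G**2).mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when each new row is parallel to the corresponding old row.
        delta = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return W @ Z, n_iter


# Hypothetical stand-in for EEMD output: 6 "virtual channels" built from 2 sources.
rng = np.random.default_rng(0)
T = 4000
t = np.arange(T) / T
sources = np.vstack([np.sin(2 * np.pi * 13 * t), rng.laplace(size=T)])
virtual_channels = rng.standard_normal((6, 2)) @ sources
virtual_channels += 0.01 * rng.standard_normal((6, T))

Z = pca_whiten(virtual_channels, n_components=2)     # 6 channels reduced to 2
estimates, n_iter = fast_ica(Z)
print("FastICA iterations:", n_iter)
```

Reducing the channel count before ICA shrinks the unmixing matrix that the fixed-point iteration has to estimate, which is the efficiency motivation the abstract gives for the PCA step. The recovered components are defined only up to order, sign, and scale.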
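[Degree-granting institution]: Southwest Jiaotong University (西南交通大学)
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TN912.3
Article ID: 2408511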
Article link: https://www.wllwen.com/kejilunwen/wltx/2408511.html