Unsupervised Speech Enhancement Algorithm Based on Dictionary Learning and Sparse Representation
Published: 2018-11-07 21:09
[Abstract]: To address the difficulty of removing unstructured noise, an unsupervised speech enhancement algorithm based on dictionary training and sparse representation is proposed. The algorithm constructs an overcomplete dictionary and trains it on noisy speech samples. First, the shortcomings of the K-singular value decomposition (K-SVD) algorithm are pointed out, and an improved dictionary training algorithm, K-bilateral random projection (K-BRP), is proposed. K-BRP is then used to iteratively update the dictionary matrix and the corresponding gain coefficient matrix, extracting strongly structured clean speech from noisy speech corrupted by unstructured noise. Extensive experiments show that, because the training samples capture the local time-frequency structure of speech signals, the proposed algorithm removes random noise effectively and maintains high speech quality and intelligibility even at low signal-to-noise ratios (SNR).
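The abstract does not give the K-BRP update rule, so the following is only a minimal sketch of the general bilateral random projection idea: approximating a matrix by a low-rank one using two random sketches instead of a full SVD. In a K-SVD-style dictionary update, a rank-1 approximation of the residual matrix yields one atom and its coefficient row; this sketch assumes a BRP approximation could take the place of that SVD step. The function name `brp_low_rank` and all parameters are hypothetical, not taken from the paper.

```python
import numpy as np

def brp_low_rank(X, r=1, rng=None):
    """Rank-r approximation of X via bilateral random projection (sketch).

    Assumption: this is the generic BRP construction
    L = Y1 (A2^T Y1)^{-1} Y2^T with Y1 = X A1 and Y2 = X^T A2,
    not necessarily the exact update used by K-BRP in the paper.
    """
    rng = np.random.default_rng(rng)
    m, n = X.shape
    A1 = rng.standard_normal((n, r))   # random right projection, n x r
    A2 = X @ A1                        # left projection built from the first sketch, m x r
    Y2 = X.T @ A2                      # right sketch, n x r
    Y1 = X @ Y2                        # left sketch with A1 refreshed to Y2, m x r
    core = A2.T @ Y1                   # small r x r matrix
    return Y1 @ np.linalg.solve(core, Y2.T)

if __name__ == "__main__":
    demo_rng = np.random.default_rng(0)
    # Hypothetical residual: one structured (rank-1) component plus weak random noise.
    atom = demo_rng.standard_normal((64, 1))
    coeffs = demo_rng.standard_normal((1, 200))
    residual = atom @ coeffs + 0.01 * demo_rng.standard_normal((64, 200))
    approx = brp_low_rank(residual, r=1, rng=1)
    rel_err = np.linalg.norm(approx - atom @ coeffs) / np.linalg.norm(atom @ coeffs)
    print(f"relative error of rank-1 BRP approximation: {rel_err:.4f}")
```

For an exactly rank-r input, this construction reproduces the matrix up to numerical error; for noisy input it keeps the dominant structured component, which is the property a dictionary update relies on.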
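[Affiliation]: College of Command Information Systems, PLA University of Science and Technology;
[Funding]: Supported by the Natural Science Foundation of Jiangsu Province (BK2012510)
[Classification No.]: TN912.3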
Article ID: 2317577
Article link: https://www.wllwen.com/kejilunwen/wltx/2317577.html