Constrained Sequential Gaussian Mixture Model for Noise Power Spectrum Estimation in Speech Enhancement

Published: 2018-07-04 16:47

Topics: Gaussian mixture model + power spectrum estimation; Source: Acta Acustica (《声学学报》), 2017, No. 05

[Abstract]: This paper proposes a maximum-likelihood method for estimating the noise log-power spectrum. A Gaussian mixture model (GMM) is built as a statistical model of the power spectrum envelope in each frequency band, and the temporal envelope is divided into speech and non-speech classes. These classes correspond to the two Gaussian components of the GMM, which describe the statistical distributions of speech and non-speech; the mean of the non-speech Gaussian component is the optimal estimate of the noise power spectrum. Using sequential learning, the model parameters are updated frame by frame under the maximum-likelihood criterion, and an optimal estimate of the noise power spectrum is produced at every frame. In addition, because a long absence of speech during sequential updating can easily destabilize the model, an online minimum description length (MDL) criterion is proposed to decide whether speech has been absent for a long time, which keeps the model stable. Experiments show that the algorithm overall outperforms the classical MS and IMCRA algorithms.
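The frame-by-frame update described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's update equations, so the code below substitutes a generic recursive (online) EM update with an exponential forgetting factor. The class name `SequentialTwoGaussian`, the helper `demo_noise_estimate`, the forgetting factor, the initial values, and the synthetic data are all assumptions made for illustration, and the MDL-based constraint is not modelled.

```python
import math
import random


class SequentialTwoGaussian:
    """Hypothetical online two-component 1-D GMM for one band's log-power (dB).

    Component 0 is treated as non-speech (noise), component 1 as speech.
    """

    def __init__(self, noise_mean, speech_mean, var=4.0, forget=0.98):
        self.w = [0.5, 0.5]                 # mixture weights
        self.mu = [noise_mean, speech_mean]  # component means
        self.var = [var, var]                # component variances
        self.forget = forget                 # assumed forgetting factor

    def _weighted_pdf(self, x, k):
        norm = math.sqrt(2.0 * math.pi * self.var[k])
        return self.w[k] * math.exp(-0.5 * (x - self.mu[k]) ** 2 / self.var[k]) / norm

    def update(self, x):
        """Consume one frame's log-power and return the current noise estimate."""
        # E-step: posterior probability of each class for this frame.
        lik = [self._weighted_pdf(x, k) for k in range(2)]
        total = lik[0] + lik[1]
        if total <= 0.0:
            # Numerical underflow: assign the frame to the nearest mean.
            nearest = 0 if abs(x - self.mu[0]) <= abs(x - self.mu[1]) else 1
            post = [1.0 if k == nearest else 0.0 for k in range(2)]
        else:
            post = [value / total for value in lik]

        # Recursive M-step: exponentially weighted parameter updates.
        eta = 1.0 - self.forget
        for k in range(2):
            self.w[k] = self.forget * self.w[k] + eta * post[k]
            step = eta * post[k] / self.w[k]
            delta = x - self.mu[k]
            self.mu[k] += step * delta
            self.var[k] = max(
                (1.0 - step) * self.var[k] + step * delta * (x - self.mu[k]), 1e-3
            )

        # Keep the lower-mean component in slot 0 (the noise class).
        if self.mu[0] > self.mu[1]:
            for params in (self.w, self.mu, self.var):
                params.reverse()
        return self.mu[0]


def demo_noise_estimate(n_frames=2000, seed=0):
    """Run the sketch on synthetic data: noise near -20 dB, speech near 0 dB."""
    rng = random.Random(seed)
    model = SequentialTwoGaussian(noise_mean=-15.0, speech_mean=-5.0)
    estimate = model.mu[0]
    for _ in range(n_frames):
        if rng.random() < 0.3:           # assumed 30% speech activity
            x = rng.gauss(0.0, 1.5)
        else:
            x = rng.gauss(-20.0, 1.0)
        estimate = model.update(x)
    return estimate
```

In this sketch the weight of a class that stops receiving frames decays geometrically toward zero, which is one way to see why a long absence of speech can destabilize a sequentially updated model and why the abstract adds a constraint for that case.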
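[Author affiliations]: School of Information Engineering, Jiangxi University of Science and Technology; Laboratory of Multi-Information Systems (多元信息系统实验室), Beijing Institute of Technology; Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences; National Computer Network Emergency Response Technical Team/Coordination Center of China;
[Funding]: Supported by the Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ150681); the Natural Science Foundation of Jiangxi University of Science and Technology (NSFJ2015-G21); the National Basic Research Program of China (2013CB329302); the National Natural Science Foundation of China (61271426, 10925419, 90920302, 61072124, 11074275, 11161140319, 91120001); the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA06030100, XDA06030500); and the Key Deployment Project of the Chinese Academy of Sciences (KGZD-EW-103-2).
[Classification number]: TN912.3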
