Speech Enhancement Based on Noise Estimation and Masking Effects

Published: 2018-01-09 10:37

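Keywords: speech enhancement based on noise estimation and masking effects. Source: Southwest Jiaotong University, master's thesis, 2014. Document type: degree thesis.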

[Abstract]: Digital speech transmission, control, and recognition are basic components of the information society. During acquisition and transmission, however, speech signals are inevitably corrupted by various kinds of noise. This not only degrades the speech quality heard by the receiver but also impairs the normal operation of voice control and speech recognition systems. Digital speech signal processing has widely reached the practical stage, and speech enhancement has become one of the problems that most urgently needs solving at this stage. The purpose of speech enhancement is to remove noise interference and improve speech intelligibility. Different types of interfering noise call for different enhancement strategies, which should improve listener comfort while suppressing background noise.
This thesis builds on the results of many researchers in the field of speech enhancement. Its contents progress step by step and are summarized as follows.
1. The basic principles and common methods of speech enhancement are briefly described, and the properties of various kinds of noise and the ways they contaminate speech are analyzed.
2. For stationary noise, the thesis introduces secondary smoothing into the voice activity detection (VAD) algorithm as a post-processing step, which reduces the partial bias that arises when the VAD method estimates stationary noise. Wiener filtering is used in place of spectral subtraction to estimate the clean speech, which avoids generating "musical noise". Balancing complexity against processing performance, the algorithm estimates the noise accurately and achieves good enhancement. An applicability analysis with several non-stationary noises shows that the improved algorithm is better suited to stationary noise.
3. For the more complex case of non-stationary noise, the thesis studies the data recursive (DDR) method. Simulation experiments with vuvuzela, babble, train, and car noise verify that the algorithm handles noise contamination effectively, and they also confirm that the improved VAD method strikes a good balance between complexity and effectiveness. The thesis finds that an enhancement algorithm suited to stationary noise is not necessarily suited to non-stationary noise, whereas an enhancement algorithm suited to non-stationary noise is also suited to stationary noise. The working DDR implementation supports the later study of the ideal binary masking (IBM) algorithm.
4. Improving intelligibility is an important goal of speech enhancement. The thesis studies two algorithms that can improve intelligibility: the IBM algorithm and the harmonic recovery (HR) algorithm. The IBM algorithm is implemented on top of the noise variance estimated by the DDR method, and simulation results verify that it improves speech intelligibility. The HR algorithm is improved with three-level frequency-band processing, which mitigates the spectral aliasing caused by the convolution operation in the traditional HR method. The enhanced speech output by the IBM algorithm is then used as the input to the improved HR method for a second stage of enhancement, which effectively improves speech intelligibility.
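The abstract gives no formulas, so the following is only a minimal sketch of the kinds of per-frequency-bin operations it names: a recursively smoothed noise-variance estimate, a Wiener gain, and a binary mask derived from an estimated local SNR. All function names, the smoothing factor `alpha`, the 0 dB threshold, and the maximum-likelihood a priori SNR estimate are assumptions chosen for illustration. The noise update is plain first-order recursive averaging, not the thesis's DDR algorithm, and a true ideal binary mask would need the clean speech and noise separately, whereas this sketch uses an estimated SNR.

```python
import math


def update_noise(noise_var, noisy_power, speech_present, alpha=0.9):
    """First-order recursive averaging of the noise variance.

    The estimate is only updated in frames flagged as speech-absent
    (for example by a VAD); otherwise it is carried over unchanged.
    """
    if speech_present:
        return list(noise_var)
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_var, noisy_power)]


def estimated_snr(noisy_power, noise_var, floor=1e-12):
    """A priori SNR estimate per bin: xi = max(gamma - 1, 0).

    gamma = |Y|^2 / noise variance is the a posteriori SNR.
    """
    return [max(p / max(n, floor) - 1.0, 0.0) for p, n in zip(noisy_power, noise_var)]


def wiener_gain(noisy_power, noise_var):
    """Wiener gain per bin: G = xi / (1 + xi), always in [0, 1)."""
    return [xi / (1.0 + xi) for xi in estimated_snr(noisy_power, noise_var)]


def binary_mask(noisy_power, noise_var, threshold_db=0.0, floor=1e-12):
    """Keep (1) bins whose estimated SNR in dB exceeds the threshold, drop (0) the rest."""
    mask = []
    for xi in estimated_snr(noisy_power, noise_var):
        snr_db = 10.0 * math.log10(max(xi, floor))
        mask.append(1 if snr_db > threshold_db else 0)
    return mask


# Toy frame with three frequency bins (power spectrum values, arbitrary units).
noisy_power = [4.0, 1.0, 2.0]
noise_var = [1.0, 1.0, 1.0]
gains = wiener_gain(noisy_power, noise_var)
mask = binary_mask(noisy_power, noise_var)
```

In a full system the gain or mask would be applied to the short-time spectrum of each frame before resynthesis. The gain attenuates bins smoothly, while the mask makes a hard keep-or-discard decision per bin.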

[Degree-granting institution]: Southwest Jiaotong University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.35

[References]