A Wiener Filtering Speech Enhancement Algorithm with High Intelligibility

Published: 2018-08-27 14:15

[Abstract]: With the rapid development of the information society, smartphones and human-machine voice dialogue devices have come into wide use, and speech signals are receiving more and more attention. However, during production, transmission, processing, and reception, speech signals are inevitably contaminated by noise from the surrounding environment and the transmission medium. Severe contamination degrades the quality and intelligibility of the speech signal, so that a listener or a speech-receiving device cannot understand it. Speech enhancement is therefore needed to separate the original clean speech from the noisy signal and filter out the noise. Traditional speech enhancement methods focus on speech quality and give the enhanced speech a higher signal-to-noise ratio (SNR). Compared with the noisy speech, however, the intelligibility of the enhanced speech is not effectively improved, because traditional enhancement algorithms also filter out useful speech while removing noise, which distorts the speech. Because Wiener filtering clearly improves speech quality and leaves little musical noise in the enhanced speech, this thesis proposes an improved algorithm based on Wiener filtering that aims to raise the intelligibility of the enhanced speech, making it easier for people or speech devices to understand. The thesis first introduces the basics of speech signals, the characteristics of human hearing, and the characteristics of noise signals, and then systematically describes four major classes of speech enhancement algorithms. It summarizes the methods for evaluating enhanced speech, including subjective listening tests, objective speech quality measures, and objective speech intelligibility measures. The gain function of the Wiener filter is obtained from the derivation of the Wiener filter. The Wiener filtering method based on a priori SNR estimation is then described in detail; this method is computationally simple and clearly improves the quality of the enhanced speech. Simulation experiments on sentence and consonant corpora show that, although this method improves speech quality, it does not genuinely improve the intelligibility of the enhanced speech. The thesis analyzes why and, starting from the residual SNR, finds that the magnitude spectrum of the enhanced speech contains both attenuation distortion and amplification distortion, and that amplification distortion greater than 6.02 dB seriously harms intelligibility. In an experiment that compares the magnitude spectrum of the original clean speech with that of the enhanced speech and removes the regions where the amplification distortion exceeds 6.02 dB, the intelligibility and quality of the enhanced speech improve markedly over the noisy speech. In real speech processing environments clean speech is not available, so the a priori SNR has to be modified instead. The region where the a priori SNR is below -10 dB is corrected, which in turn corrects the gain function of the filter; the region where the magnitude spectrum exceeds 6.02 dB is then identified from the available information and constrained, yielding an improved Wiener filtering enhancement algorithm with high intelligibility. Simulation experiments on sentence and consonant corpora confirm that the improved algorithm does raise the intelligibility of the enhanced speech.
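The two ideas the abstract names, a Wiener gain driven by the a priori SNR and the removal of bins whose amplification distortion exceeds 6.02 dB when clean speech is available, can be illustrated with a small sketch. This is not the thesis's implementation: the function names and toy values are hypothetical, the gain uses the textbook form G = ξ / (1 + ξ), the 6.02 dB limit is read as a magnitude ratio of about 2 (20·log10(2) ≈ 6.02), and "removing" a bin is assumed to mean setting it to zero. The correction applied below -10 dB a priori SNR is not reproduced, because the abstract does not specify it.

```python
import numpy as np


def wiener_gain(xi):
    """Textbook Wiener gain G = xi / (1 + xi) for a priori SNR xi (linear scale)."""
    xi = np.asarray(xi, dtype=float)
    return xi / (1.0 + xi)


def remove_large_amplification(enhanced_mag, clean_mag, limit_db=6.02):
    """Oracle constraint (assumption): zero every bin whose enhanced magnitude
    exceeds the clean magnitude by more than limit_db.

    Returns the constrained magnitudes and the boolean mask of kept bins.
    """
    enhanced_mag = np.asarray(enhanced_mag, dtype=float)
    clean_mag = np.asarray(clean_mag, dtype=float)
    ratio = 10.0 ** (limit_db / 20.0)  # about 2.0 for 6.02 dB
    keep = enhanced_mag <= ratio * clean_mag
    return np.where(keep, enhanced_mag, 0.0), keep


# Toy magnitude spectra for four frequency bins (hypothetical values).
clean = np.array([1.0, 0.5, 0.1, 0.0])
noisy = np.array([1.2, 0.9, 0.8, 0.6])
xi = np.array([10.0, 1.0, 1.0, 0.5])  # assumed a priori SNR per bin

enhanced = wiener_gain(xi) * noisy
constrained, keep = remove_large_amplification(enhanced, clean)
print(keep)
print(constrained)
```

In this toy case the last two bins are dropped: the filter leaves far more energy there than the clean speech contains, which is the kind of over-amplification the abstract identifies as harmful to intelligibility.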
[Degree-granting institution]: Taiyuan University of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TN912.35

[References]