Research on Speech Signal Dereverberation Techniques
Topic of this thesis: dereverberation + complex cepstrum; Source: Lanzhou Jiaotong University, master's thesis, 2017
[Abstract]: In everyday life, when devices such as telephones, hearing aids, and mobile phones are used in relatively enclosed indoor spaces, strong reverberation arises if the sound source is far from the receiver. Reverberation causes masking between the syllables of speech and seriously degrades the listening experience. Dereverberation is an important part of speech enhancement, and it also serves as a preprocessing stage for other speech processing tasks such as speech synthesis, sound source localization, and speech recognition. Dereverberation techniques are also widely applicable in architectural acoustics, vibration acoustics, seismic data analysis, biomedicine, radar, and sonar. The main work of this thesis is as follows. First, the generation of reverberation, its mathematical model, and its characteristic parameters are analyzed, and evaluation metrics for dereverberation performance are studied, including two subjective metrics and three objective metrics. Extensive simulation experiments show that the two frequency-domain objective metrics reflect subjective perception better than time-domain objective evaluation. Second, the theory of the complex cepstrum is analyzed, and three single-channel dereverberation methods are studied and compared by simulation: complex cepstrum domain filtering, minimum-phase decomposition, and complex cepstrum blind deconvolution. Finally, single-microphone dereverberation uses only time-domain and frequency-domain information and has little spatial information available, so it is difficult to achieve a good dereverberation result.
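As a rough illustration of why the complex cepstrum suits this problem, the Python sketch below (not taken from the thesis) shows that convolution with a room response becomes addition in the complex cepstrum domain, and that a single echo appears as isolated peaks that can be zeroed out. The function names, the three-sample "dry" signal, and the one-echo room response with a known delay are all assumptions chosen for the toy example. The signals are minimum-phase, so the linear-phase removal step that a general implementation needs is skipped.

```python
import numpy as np

def complex_cepstrum(x, n_fft):
    """Complex cepstrum of a real sequence, zero-padded to n_fft samples.

    Assumption: the signal has no linear-phase (pure delay) component,
    so no linear-phase correction is applied after unwrapping.
    """
    spectrum = np.fft.fft(x, n_fft)
    phase = np.unwrap(np.angle(spectrum))
    log_spectrum = np.log(np.abs(spectrum)) + 1j * phase
    return np.real(np.fft.ifft(log_spectrum))

def inverse_complex_cepstrum(c):
    """Map a complex cepstrum back to the time domain."""
    log_spectrum = np.fft.fft(c)
    return np.real(np.fft.ifft(np.exp(log_spectrum)))

# Toy signals (hypothetical): a short minimum-phase "dry" signal and a
# room response made of the direct path plus one echo.
n_fft = 4096
delay, gain = 50, 0.5
dry = np.array([1.0, -0.6, 0.2])
room = np.zeros(delay + 1)
room[0] = 1.0
room[delay] = gain
wet = np.convolve(dry, room)  # reverberant observation

c_dry = complex_cepstrum(dry, n_fft)
c_room = complex_cepstrum(room, n_fft)
c_wet = complex_cepstrum(wet, n_fft)

# Cepstral-domain filtering: the echo contributes peaks at multiples of
# the delay, so zero those quefrency bins and transform back.
c_filtered = c_wet.copy()
c_filtered[delay:n_fft // 2:delay] = 0.0
recovered = inverse_complex_cepstrum(c_filtered)[:len(dry)]

print("max |c_wet - (c_dry + c_room)|:", np.max(np.abs(c_wet - (c_dry + c_room))))
print("recovered dry signal:", np.round(recovered, 6))
```

In this toy case the echo delay is known in advance. Real room responses are long, are generally not minimum-phase, and have unknown structure, which is where methods such as minimum-phase decomposition and blind deconvolution come in.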
The thesis therefore studies microphone-array speech dereverberation and combines beamforming with single-channel dereverberation techniques. It first studies a DSB (delay-and-sum beamformer) dereverberation method based on fixed beamforming and a TF-GSC (transfer-function generalized sidelobe canceller) dereverberation method based on adaptive beamforming, and analyzes both by simulation. Because both suppress additive noise and reverberation in the beam direction well, DSB is combined with complex cepstrum blind deconvolution to obtain a more effective dereverberation method. TF-GSC is combined with minimum-phase decomposition. Since that combination is computationally expensive, an improved version is analyzed that uses the phase of the reverberant speech captured by a single microphone in place of the phase of the all-pass component, which reduces the computation. A comparison of the simulation results shows that combining beamforming with single-channel dereverberation methods achieves good dereverberation performance.
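The fixed beamformer mentioned above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function name `delay_and_sum`, the integer per-microphone delays (assumed known), the sine-wave source, and the use of independent Gaussian noise as a stand-in for components that are incoherent across microphones are all hypothetical choices.

```python
import numpy as np

def delay_and_sum(mic_signals, delays):
    """Advance each channel by its integer sample delay, then average.

    mic_signals: array of shape (n_mics, n_samples)
    delays: non-negative integer delay (in samples) of each channel
    """
    mic_signals = np.asarray(mic_signals, dtype=float)
    n_mics, n_samples = mic_signals.shape
    out = np.zeros(n_samples)
    for channel, d in zip(mic_signals, delays):
        aligned = np.zeros(n_samples)
        aligned[:n_samples - d] = channel[d:]  # tail is zero-padded
        out += aligned
    return out / n_mics

# Toy scene (hypothetical): one source reaches four microphones with
# known delays, and each microphone adds independent noise.
rng = np.random.default_rng(0)
n_samples = 4000
source = np.sin(2 * np.pi * 0.01 * np.arange(n_samples))
delays = [0, 3, 6, 9]

mics = []
for d in delays:
    delayed = np.zeros(n_samples)
    delayed[d:] = source[:n_samples - d]
    mics.append(delayed + 0.5 * rng.standard_normal(n_samples))
mics = np.array(mics)

output = delay_and_sum(mics, delays)

# Compare errors only where every channel still overlaps the source.
valid = slice(0, n_samples - max(delays))
err_single = np.mean((mics[0][valid] - source[valid]) ** 2)
err_dsb = np.mean((output[valid] - source[valid]) ** 2)
print("single-microphone error power:", err_single)
print("delay-and-sum error power:", err_dsb)
```

Averaging the aligned channels leaves the coherent source unchanged while attenuating whatever differs between microphones. Residual reverberation that remains coherent in the look direction is not removed this way, which is consistent with the abstract's motivation for pairing beamforming with a single-channel method.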
[Degree-granting institution]: Lanzhou Jiaotong University
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TN912.3