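Research on Noise-Robust Methods in Speech Recognition
Published: 2018-05-30 22:10
Topic: speech recognition + noise robustness; Source: China University of Mining and Technology, 2014 master's thesis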
[Abstract]: Speech recognition systems now perform well under ideal conditions, but the interference signals present in real application environments cause recognition accuracy to drop sharply. Denoising has therefore become key to deploying speech recognition in everyday use, and it remains a hot problem the field still has to solve. The main noise-robust methods in speech recognition currently fall into three directions: speech enhancement, noise-robust feature extraction, and model compensation. This thesis combines the first two into a combined denoising method that further improves system robustness.
First, for speech enhancement, the thesis analyzes the strengths and weaknesses of the hard threshold function, the soft threshold function, the soft-hard compromise threshold function, and the Garrote threshold function, and constructs an improved threshold function that combines the advantages of all of them. The feasibility and effectiveness of this function are then verified through Matlab simulation.
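The four baseline threshold functions named above can be sketched as follows. This is a minimal illustration using their common textbook forms, applied to a single wavelet coefficient `w` with threshold `t`; the function names and the weight `alpha` are assumptions made here for illustration. The thesis's own improved function is not given in the abstract and is not reproduced.

```python
import math


def hard_threshold(w: float, t: float) -> float:
    """Keep the coefficient unchanged if it exceeds t in magnitude, else zero it."""
    return w if abs(w) > t else 0.0


def soft_threshold(w: float, t: float) -> float:
    """Shrink surviving coefficients toward zero by the full threshold t."""
    return math.copysign(abs(w) - t, w) if abs(w) > t else 0.0


def compromise_threshold(w: float, t: float, alpha: float = 0.5) -> float:
    """Soft-hard compromise: shrink by alpha * t, with 0 <= alpha <= 1.

    alpha = 0 gives the hard threshold, alpha = 1 the soft threshold.
    (alpha is an illustrative parameter name.)
    """
    return math.copysign(abs(w) - alpha * t, w) if abs(w) > t else 0.0


def garrote_threshold(w: float, t: float) -> float:
    """Garrote: shrink by t**2 / w, so large coefficients are barely changed."""
    return w - t * t / w if abs(w) > t else 0.0


if __name__ == "__main__":
    # Compare how each rule treats a small and a large coefficient.
    for w in (0.5, 2.0, -2.0):
        print(w, hard_threshold(w, 1.0), soft_threshold(w, 1.0),
              compromise_threshold(w, 1.0), garrote_threshold(w, 1.0))
```

The hard rule is discontinuous at the threshold, the soft rule biases every surviving coefficient by `t`, and the compromise and Garrote rules sit between the two. That trade-off is the usual motivation for designing a new threshold function.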
Second, noise-robust feature extraction usually relies on MFCC parameters or on MFCC parameters improved with wavelet multiresolution analysis. The FFT used in MFCC extraction has an analysis window that stays fixed in both the time and frequency domains, which conflicts with the non-stationary nature of speech signals. MFCC parameters based on wavelet multiresolution analysis, in turn, decompose only the low-frequency part after each transform and leave the high-frequency part untouched. To address these two shortcomings, the thesis presents an improved feature extraction method based on wavelet packet analysis and verifies that it yields good recognition results.
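The structural difference between multiresolution analysis and wavelet packet analysis can be sketched as below. This is a minimal illustration that assumes the Haar wavelet and uses log band energies as a stand-in feature; the wavelet, the function names, and the feature are illustrative choices, not the thesis's actual extraction method.

```python
import math


def haar_split(x):
    """One Haar analysis step on an even-length signal: return (low, high) bands."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return low, high


def multiresolution_bands(x, levels):
    """Multiresolution analysis: only the low band is split again at each level."""
    bands = []
    low = list(x)
    for _ in range(levels):
        low, high = haar_split(low)
        bands.append(high)      # the high band is kept as-is, never refined
    bands.append(low)
    return bands                # levels + 1 bands


def wavelet_packet_bands(x, levels):
    """Wavelet packet analysis: both low and high bands are split at each level."""
    nodes = [list(x)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_split(node)]
    return nodes                # 2 ** levels bands, in natural (not frequency) order


def band_log_energies(bands, eps=1e-12):
    """Log energy per band; eps avoids log(0) for silent bands."""
    return [math.log(sum(v * v for v in band) + eps) for band in bands]


if __name__ == "__main__":
    frame = [1.0, 3.0, -2.0, 4.0, 0.5, 0.0, 2.0, -1.0]  # toy 8-sample frame
    print(len(multiresolution_bands(frame, 3)))   # 4 bands
    print(len(wavelet_packet_bands(frame, 3)))    # 8 bands
    print(band_log_energies(wavelet_packet_bands(frame, 3)))
```

With three levels, the multiresolution tree yields four bands whose resolution is coarse at high frequencies, while the packet tree yields eight bands that cover the high-frequency range at the same resolution as the low-frequency range.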
Finally, in the performance analysis, a speaker-independent, isolated-word, small-vocabulary speech recognition system is built on the combined denoising method. Recognition rates of different systems are then compared under several signal-to-noise ratio (SNR) conditions, which verifies the effectiveness of the combined method.
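An evaluation of this kind is typically set up by mixing noise into clean speech at a chosen SNR and then scoring recognition accuracy. The sketch below shows only that general procedure. The function names, the synthetic sine "speech", and the Gaussian noise are assumptions for illustration; the abstract does not describe the thesis's actual data or test protocol.

```python
import math
import random


def power(x):
    """Mean power of a signal."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(clean, noise, snr_db):
    """Scale noise so that 10 * log10(P_clean / P_noise) equals snr_db, then add it."""
    gain = math.sqrt(power(clean) / (power(noise) * 10 ** (snr_db / 10)))
    return [c + gain * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR of a noisy signal, given the clean reference."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(power(clean) / power(residual))


def recognition_rate(predicted, reference):
    """Fraction of isolated words recognized correctly."""
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


if __name__ == "__main__":
    rng = random.Random(0)
    clean = [math.sin(2 * math.pi * 5 * n / 200) for n in range(200)]  # toy signal
    noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
    for snr in (0, 5, 10, 15):
        noisy = mix_at_snr(clean, noise, snr)
        print(snr, round(measured_snr_db(clean, noisy), 6))
```

Running the same word list through each system at each SNR and tabulating `recognition_rate` gives the kind of comparison the paragraph above describes.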
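[Degree-granting institution]: China University of Mining and Technology
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.34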
Article ID: 1957021
Article link: https://www.wllwen.com/kejilunwen/wltx/1957021.html