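Research on Array Speech Recognition Methods in Complex Environments
Published: 2018-06-19 11:55
Topics: microphone array + speech recognition; Source: Liaoning University of Technology (辽宁工业大学), master's thesis, 2014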
[Abstract]: Speech recognition belongs to the fields of artificial intelligence and speech processing; its goal is to let machines understand human language and carry out operations according to spoken commands. Single-channel speech recognition has developed rapidly and achieves good recognition performance, but it is inflexible, requires the speaker to wear a microphone, and restricts the speaker's movement. Microphone arrays overcome these drawbacks, so microphone-array speech recognition has become a research hotspot in recent years.
After reviewing research progress in speech recognition in China and abroad, the thesis systematically analyzes the open problems in current speech recognition.
It describes the theoretical basis of speech signal preprocessing, including sampling and quantization, framing and windowing, and endpoint detection.
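The framing and windowing step mentioned above can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the function names, the Hamming window, and the default frame length and hop (400 and 160 samples, which would correspond to 25 ms and 10 ms at an assumed 16 kHz sampling rate) are all assumptions that the abstract does not specify.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames, dropping an incomplete tail."""
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    if len(samples) < frame_len:
        return []
    n_frames = 1 + (len(samples) - frame_len) // hop
    return [samples[i * hop:i * hop + frame_len] for i in range(n_frames)]


def hamming(n):
    """Hamming window of length n: 0.54 - 0.46 * cos(2*pi*k / (n - 1))."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * k / (n - 1)) for k in range(n)]


def windowed_frames(samples, frame_len=400, hop=160):
    """Frame the signal and taper each frame with a Hamming window.

    The defaults are illustrative assumptions (25 ms / 10 ms at 16 kHz).
    """
    window = hamming(frame_len)
    return [
        [s * w for s, w in zip(frame, window)]
        for frame in frame_signal(samples, frame_len, hop)
    ]


def short_time_energy(frame):
    """Sum of squared samples, a common cue for energy-based endpoint detection."""
    return sum(s * s for s in frame)
```

Each windowed frame would then be passed to feature extraction, and the per-frame energy could be compared against a threshold to separate speech from silence.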
It analyzes in detail the Mel-frequency cepstral coefficients (MFCC), the parameters most commonly used for feature extraction.
It studies the three basic algorithms of the hidden Markov model (HMM), as well as the choice of modeling units and the determination of the number of states in speech recognition.
It also presents the problems that arise when applying HMMs and their solutions.
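The abstract does not name the three basic HMM algorithms; in the standard textbook treatment they address evaluation (forward-backward), decoding (Viterbi), and training (Baum-Welch). As a hedged illustration of the evaluation problem only, the sketch below computes the likelihood of an observation sequence with the forward algorithm for a small discrete HMM. The function name and the toy model values are assumptions made for this example.

```python
def forward_likelihood(init, trans, emit, observations):
    """Return P(observations | model) for a discrete HMM (forward algorithm).

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability of emitting symbol o while in state i
    """
    if not observations:
        raise ValueError("observations must not be empty")
    n_states = len(init)
    # Initialization: alpha_1(i) = pi_i * b_i(o_1)
    alpha = [init[i] * emit[i][observations[0]] for i in range(n_states)]
    # Induction: alpha_{t+1}(j) = (sum_i alpha_t(i) * a_ij) * b_j(o_{t+1})
    for obs in observations[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs]
            for j in range(n_states)
        ]
    # Termination: sum over all final states
    return sum(alpha)


# Toy two-state, two-symbol model (illustrative values only).
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.5], [0.1, 0.9]]
likelihood = forward_likelihood(init, trans, emit, [0, 1])
```

For long utterances the raw probabilities underflow, so practical recognizers scale the forward variables or work in the log domain.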
To address the unsatisfactory performance of single-channel speech recognition in real environments, the thesis first proposes an array speech recognition method based on multi-channel selection. Targeting real enclosed environments, the method constructs the correlation matrix of the time-delay-compensated array signals and performs a subspace decomposition on it. In the signal subspace, a channel selection method based on normalized multichannel cross-correlation coefficients discards the weakly correlated channels and selects the channels with the largest cross-correlation coefficients to form a new microphone array, whose output signal is then obtained by beamforming.
Finally, a speech recognizer produces the recognition result. Building on this, the thesis observes that speech recognition is not only a signal processing problem but also a model discrimination problem. Array beamforming and speech recognition are therefore processed jointly: information from the speech recognition system is fed back to the front-end array processing, a conjugate gradient algorithm finds the filter coefficients that maximize the likelihood of the correct hypothesis, and the filtered output is passed to the speech recognizer to obtain the recognition result. Simulation experiments show that these methods reduce the number of array elements and the computational load, strengthen the information useful for recognition, raise the recognition rate, and are robust in complex acoustic environments.
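The channel selection idea can be illustrated with a simplified sketch. It assumes the channels are already time-delay compensated, scores each channel by its mean zero-lag normalized cross-correlation with the other channels, keeps the best-scoring ones, and averages them (delay-and-sum beamforming). This is not the thesis's method as such: the subspace decomposition is omitted, the scoring rule is a stand-in for its normalized multichannel cross-correlation coefficient, and all function names are hypothetical.

```python
import math


def normalized_correlation(x, y):
    """Zero-lag normalized cross-correlation of two equal-length signals."""
    if len(x) != len(y):
        raise ValueError("signals must have the same length")
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den > 0 else 0.0


def select_channels(channels, keep):
    """Return the indices of the `keep` channels that agree best with the rest.

    Each channel is scored by its mean correlation with all other channels
    (a simplification assumed for this sketch).
    """
    n = len(channels)
    if n < 2:
        raise ValueError("need at least two channels")
    scores = []
    for i in range(n):
        others = [
            normalized_correlation(channels[i], channels[j])
            for j in range(n)
            if j != i
        ]
        scores.append(sum(others) / len(others))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])


def delay_and_sum(channels):
    """Average already time-aligned channels sample by sample."""
    n = len(channels)
    return [sum(samples) / n for samples in zip(*channels)]
```

A caller would pass the aligned microphone signals to `select_channels`, beamform only the returned channels with `delay_and_sum`, and hand the result to the recognizer. Using fewer channels is what reduces the element count and computation in this scheme.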
[Degree-granting institution]: Liaoning University of Technology
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TN912.34
Article ID: 2039801
Article link: https://www.wllwen.com/kejilunwen/wltx/2039801.html