Research on Front-End Processing Techniques for Speaker-Independent Speech Recognition

Published: 2018-07-17 03:06

[Abstract]: In recent years, with the continuing development of artificial intelligence, speech recognition has gradually moved from the research stage into practical application and remains a technology of considerable research value. How to optimize system performance, however, is still a focus of discussion in speech recognition research. This thesis describes the basic structure and principles of a complete speech recognition system, studies several of its key technologies in depth, and proposes corresponding improved algorithms. The overall recognition flow comprises speech endpoint detection, feature parameter extraction, and the training and recognition algorithms for the speech model. First, the thesis studies several key front-end technologies: speech signal preprocessing, endpoint detection, and feature extraction. For low signal-to-noise ratio (SNR) environments, it proposes improved algorithms for two key techniques, endpoint detection and pitch period extraction: an endpoint detection algorithm based on empirical mode decomposition (EMD) and improved wavelet entropy, and a pitch period extraction algorithm based on wavelet packet transform weighted autocorrelation. Both are compared with the original algorithms. Second, the thesis selects Mel-frequency cepstral coefficients (MFCC) as the feature parameters, examines the MFCC extraction process in detail, and proposes WPTMFCC, a noise-robust speech feature parameter based on the wavelet packet transform. Experiments show that the new feature parameter improves system robustness and yields a higher recognition rate than the traditional LPCC and MFCC feature parameters under noise at different SNRs. A recognition system based on the hidden Markov model (HMM) is built on the MATLAB platform, and comparative simulation experiments show that the improved endpoint detection technique and the WPTMFCC feature parameter raise the system's recognition rate. Finally, a GUI for the recognition system is designed, through which utterances in the speech database can be recognized and demonstrated in real time.
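For readers unfamiliar with pitch period extraction, the following is a minimal sketch of the plain autocorrelation baseline on which weighted variants build. It is not the thesis's wavelet packet weighted algorithm, and it is written in Python rather than MATLAB; the function names, the 50 to 400 Hz search range, and the synthetic 200 Hz test tone are all assumptions made for illustration.

```python
import math


def autocorrelation(frame, lag):
    """Biased autocorrelation of one frame at the given lag."""
    return sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))


def estimate_pitch_period(frame, sample_rate, f_min=50.0, f_max=400.0):
    """Return the lag (in samples) with the largest autocorrelation.

    Only lags that correspond to pitches between f_min and f_max
    (assumed bounds for voiced speech) are searched.
    """
    min_lag = int(sample_rate / f_max)
    max_lag = min(int(sample_rate / f_min), len(frame) - 1)
    return max(range(min_lag, max_lag + 1),
               key=lambda lag: autocorrelation(frame, lag))


# Synthetic "voiced" frame: a clean 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(400)]

period = estimate_pitch_period(frame, sample_rate)
print("pitch period (samples):", period)
print("pitch frequency (Hz):", sample_rate / period)
```

On clean signals the autocorrelation peak sits at the true period; in low-SNR conditions that peak becomes unreliable, which is the situation the improved algorithm described in the abstract is meant to address.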
[Degree-granting institution]: Anhui University of Technology
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TN912.34
