A Method for Recognizing Complex Sounds in Noisy Environments

Published: 2018-10-26 19:27

[Abstract]: Society has entered the era of artificial intelligence, and speech recognition technology is already quite mature. For the complex sounds of everyday life, however, recognition research is still far from mature and has many open problems, owing to the complexity and diversity of the sound sources and to interference from background noise. Research on recognizing complex sounds in noisy environments therefore has considerable practical and theoretical value. A complex sound is a sound signal that contains several types of sound whose boundaries are hard to separate. Current detection methods for such sounds mostly carry over traditional speech recognition techniques. Speech signals, however, are produced in a fairly fixed way and have stable energy, whereas complex sounds come in many varieties, are produced by different mechanisms, have large instantaneous energy, and are also disturbed by environmental noise. Traditional speech recognition techniques alone therefore do not transfer well to complex sound recognition. To address the low recognition accuracy for this class of sounds in noisy environments, this thesis carries out the following work:
(1) It first introduces several time-domain and frequency-domain features commonly used in sound recognition. By extracting and analyzing the feature parameters of complex sound samples, it proposes describing complex sounds with a combination of time-domain and frequency-domain features, and it runs comparison experiments on several hybrid features.
(2) To address the difficulty of selecting training samples by hand, it proposes a training sample selection algorithm based on clustering annotation, which selects a representative set of training samples more quickly and accurately, and it runs comparison experiments on different clustering methods.
(3) Finally, it proposes a complex sound recognition framework based on the hidden Markov model (Hidden Markov Model, HMM) and carries out training and recognition with it.
Simulation experiments on two different types of complex sound, train sounds and bird calls, show that recognition accuracy and efficiency in noisy environments improve significantly when complex sounds are represented by a hybrid feature that combines the time-domain short-time autocorrelation function with the frequency-domain Mel-frequency cepstral coefficients, training samples are selected with the proposed algorithm based on affinity propagation clustering annotation, and modeling is done with the HMM recognition framework.
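As a rough illustration of the time-domain half of the hybrid feature mentioned in the abstract, the sketch below computes a normalized short-time autocorrelation function per frame in plain Python. The frame length, hop size, lag range, helper names, and the synthetic test tone are all assumptions made for this example; the abstract does not give the thesis's actual parameters, and the Mel-frequency cepstral part of the feature would have to come from a separate extractor.

```python
import math


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames; a trailing partial frame is dropped."""
    last_start = len(samples) - frame_len
    return [samples[s:s + frame_len] for s in range(0, last_start + 1, hop)]


def short_time_autocorr(frame, max_lag):
    """Normalized short-time autocorrelation R(k) / R(0) for k = 0..max_lag."""
    n = len(frame)
    r = [sum(frame[i] * frame[i + k] for i in range(n - k))
         for k in range(max_lag + 1)]
    if r[0] == 0:  # silent frame: avoid dividing by zero
        return [0.0] * (max_lag + 1)
    return [value / r[0] for value in r]


def dominant_lag(acf, min_lag):
    """Lag (>= min_lag) at which the autocorrelation is largest."""
    return max(range(min_lag, len(acf)), key=lambda k: acf[k])


# Hypothetical toy input: a pure tone with a period of exactly 20 samples.
period = 20
signal = [math.sin(2 * math.pi * t / period) for t in range(400)]

# Assumed framing parameters, chosen only to keep the example small.
frames = frame_signal(signal, frame_len=100, hop=50)
acf_features = [short_time_autocorr(f, max_lag=40) for f in frames]

for index, acf in enumerate(acf_features):
    print(index, dominant_lag(acf, min_lag=5))
```

For this toy tone the autocorrelation peak falls at the 20-sample period in every frame. In a pipeline like the one the abstract describes, each frame's autocorrelation values would presumably be concatenated with that frame's Mel-frequency cepstral coefficients before HMM training, though the exact combination scheme is not specified here.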
[Degree-granting institution]: Hefei University of Technology
[Degree level]: Master's
[Year degree granted]: 2017
[Classification number]: TN912.34
