Single-Channel Speech Separation Based on Deep Learning

Published: 2018-08-25 18:30

[Abstract]: Speech separation covers both separating one voice from another and separating voice from noise. This thesis focuses on the latter, which is also known as speech enhancement. As artificial intelligence advances, speech interaction technology is used ever more widely in everyday life, but noise interference often severely degrades its performance, which makes separating speech from noise particularly important. Moreover, because many speech interaction scenarios rely on a single microphone, single-microphone speech separation has drawn growing attention from researchers in recent years. Traditional single-channel speech separation algorithms fall into two categories: unsupervised and supervised. Unsupervised techniques are mostly based on digital signal processing, such as spectral subtraction and Wiener filtering. Commonly used traditional supervised algorithms include speech separation based on shallow artificial neural networks, on non-negative matrix factorization (NMF), and on hidden Markov models (HMM). In recent years, with the continued development of deep neural network (DNN) technology, DNN-based single-channel speech separation has made great progress. The strong nonlinear modeling capability of DNNs enables DNN-based speech separation to achieve good separation results, and it is gradually becoming a new trend in speech separation. This thesis first analyzes the advantages and disadvantages of traditional speech separation algorithms and DNN-based ones, and then proposes two improved algorithms: (1) a joint optimization model based on DNN and NMF; (2) a joint optimization model based on DNN and convolutive non-negative matrix factorization (CNMF). Finally, a series of experiments demonstrates the effectiveness of the algorithms.
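For readers unfamiliar with the conventional NMF baseline named in the abstract, the following is a minimal sketch of supervised NMF separation on magnitude spectrograms. It is not the thesis's joint DNN-NMF model, whose details the abstract does not give. It assumes the Euclidean cost with the standard multiplicative updates, precomputed nonnegative spectrograms (frequency bins by frames), and a soft mask for reconstruction; the names `nmf`, `separate`, and `EPS` are hypothetical and chosen only for illustration.

```python
import numpy as np

EPS = 1e-10  # assumed small constant to avoid division by zero


def nmf(V, rank, n_iter=200, W=None, seed=0):
    """Approximate a nonnegative matrix V (freq x frames) as W @ H.

    Uses multiplicative updates for the Euclidean cost. If W is given,
    it is kept fixed (and `rank` is ignored); only H is estimated.
    """
    rng = np.random.default_rng(seed)
    update_W = W is None
    if update_W:
        W = rng.random((V.shape[0], rank)) + EPS
    H = rng.random((W.shape[1], V.shape[1])) + EPS
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + EPS)
        if update_W:
            W *= (V @ H.T) / (W @ H @ H.T + EPS)
    return W, H


def separate(V_mix, W_speech, W_noise, n_iter=200):
    """Split a mixture spectrogram using pre-trained speech and noise bases."""
    W = np.hstack([W_speech, W_noise])  # concatenated, fixed dictionary
    _, H = nmf(V_mix, W.shape[1], n_iter=n_iter, W=W)
    k = W_speech.shape[1]
    S = W_speech @ H[:k]          # speech part of the model
    N = W_noise @ H[k:]           # noise part of the model
    mask = S / (S + N + EPS)      # soft mask with values in [0, 1)
    return mask * V_mix, (1.0 - mask) * V_mix
```

In this sketch the bases would be trained once per source, for example `W_speech, _ = nmf(V_clean, rank)` on clean speech and `W_noise, _ = nmf(V_noise, rank)` on noise, and then passed to `separate` together with the mixture spectrogram. Because the two masks sum to one, the two estimates always add back up to the mixture.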
[Degree-granting institution]: Inner Mongolia University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TN912.3
