Single-Channel Speech Separation Based on Computational Auditory Scene Analysis

Published: 2018-06-23 15:23

Topics: blind source separation + single-channel speech separation; Source: Beijing Jiaotong University, doctoral dissertation, 2014

[Abstract]: Single-channel speech separation (SCSS) is the process of recovering the original sound sources from a single observed mixture when no prior information about the sources is available. Computational auditory scene analysis (CASA) is a new approach to this problem. It separates speech by finding perceptually relevant, discriminative features in the speech signal, and it avoids making excessive assumptions about the noise characteristics. Current CASA research concentrates on two directions: 1) data-driven CASA and 2) model-based CASA. The former corresponds mainly to the fast, instinctive conditioned reflexes of living organisms, whereas the latter addresses relatively slow, high-level reasoning. The ability of organisms to react quickly to complex acoustic scenes suggests that much of the work of sound source separation is done at a low level. For this reason, this dissertation studies data-driven CASA in some depth. Its main work and contributions are as follows:

1. To address the low resolution of the short-time amplitude modulation spectrum (AMS), a co-channel (two-talker) speech separation algorithm based on a reassignment strategy is proposed. The algorithm uses low-pass filters with variable cutoff frequencies to extract amplitude modulation (AM) signals that vary from subband to subband. It then relocates every energy point in the spectrum of the extracted AM signal, which effectively concentrates the signal components and eases the trade-off between time resolution and frequency resolution. Experimental results show that the speech separation method based on the reassigned AMS performs clearly better.

2. Inspired by Schroeder's histogram, Goldstein's theory of auditory perception, and Meddis's "correlogram", a multi-pitch detection algorithm based on the "Gaussgram" is proposed. The Gaussgram is obtained by modifying the correlogram with Gaussian functions of variable bandwidth, and it suppresses sub-harmonics. When it is used for pitch detection, half-frequency errors in single-frame pitch detection are clearly reduced. In addition, the method uses the detected dominant pitch track to eliminate its sub-harmonic track, which further suppresses half-frequency errors. System evaluation shows that the proposed multi-pitch detection algorithm produces fewer double-frequency and half-frequency errors.

3. A new method for adapting the quantization threshold of a multi-layer perceptron is proposed, yielding an improved multi-layer perceptron (MLP). Embedding this MLP in the CASA framework improves the robustness of the system when the training and test signal-to-noise ratios (SNRs) are mismatched, and reduces the resulting drop in performance. Comparative experiments show that the method improves the performance of the separation system at different SNRs.
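The half-frequency error mentioned in the pitch-detection contribution can be made concrete with a plain autocorrelation pitch estimator. The sketch below is only an illustrative baseline and is not the Gaussgram algorithm of the dissertation, whose details the abstract does not give. The function name `autocorr_pitch`, the 80 to 400 Hz search range, the 16 kHz sampling rate, and the synthetic test frame are all assumptions made for this example.

```python
import numpy as np


def autocorr_pitch(frame, fs, fmin=80.0, fmax=400.0):
    """Estimate one pitch value (Hz) for a frame from its autocorrelation peak.

    Illustrative baseline only; not the Gaussgram method of the dissertation.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()
    n = len(frame)
    # Linear (non-circular) autocorrelation via a zero-padded FFT.
    power = np.abs(np.fft.rfft(frame, 2 * n)) ** 2
    acf = np.fft.irfft(power)[:n]
    acf = acf / acf[0]  # normalize so that acf[0] == 1
    # Search only the lags that correspond to the assumed pitch range.
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), n - 1)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return fs / best_lag, acf


fs = 16000                      # assumed sampling rate (Hz)
t = np.arange(640) / fs         # one 40 ms frame
# Synthetic voiced frame: 200 Hz fundamental plus two weaker harmonics.
frame = sum(np.sin(2 * np.pi * 200 * k * t) / k for k in (1, 2, 3))
f0, acf = autocorr_pitch(frame, fs)
print(f0)
```

For this clean frame the strongest peak lies at a lag of 80 samples (200 Hz), but the autocorrelation also has a strong peak at twice that lag (160 samples, 100 Hz). In noisy or two-talker frames such a sub-harmonic peak can win, which is the half-frequency error that sub-harmonic suppression is meant to reduce.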
[Degree-granting institution]: Beijing Jiaotong University
[Degree level]: Doctoral
[Year degree granted]: 2014
[Classification number]: TN912.3

[Co-cited references]