Research on Key Algorithms of a Voiceprint Recognition System Based on Mobile Terminals

Published: 2018-08-09 08:14

[Abstract]: Voiceprint recognition is a biometric authentication technique. It extracts features that reflect a speaker's physiological and behavioral characteristics from the speaker's speech and combines them with pattern recognition theory to determine the speaker's identity. This thesis studies the key techniques of a voiceprint recognition system for mobile terminals. For speech endpoint detection, it proposes an improved two-stage fusion endpoint detection method based on energy and zero-crossing rate. Unlike the traditional energy and zero-crossing-rate method, the improved method performs energy detection and zero-crossing detection separately, so the two detections run at the same time without affecting each other and can be computed in parallel on multiple threads. The improved method also uses a single threshold for detection, which halves the number of threshold parameters relative to the traditional algorithm and simplifies the procedure. For mobile terminals with limited space resources, the improved algorithm is compared with the commonly used single-threshold energy detection method, and the voiceprint recognition system that uses the improved algorithm achieves a higher recognition rate. The improved energy and zero-crossing-rate two-stage fusion method therefore has high practical value on mobile terminals. To address the problem that the traditional speech-frame voting method cannot bring out the differences among the decisions of individual frames, the thesis proposes a weighted voting method based on likelihood. The method weights each speech frame according to the likelihood between that frame and the probabilistic model, so that frames with larger likelihood receive larger weights and higher confidence. This sharpens the differences among the per-frame decisions and makes the fused result more accurate. Repeated weighted tests confirm that a voiceprint recognition system based on weighted voting outperforms one based on traditional voting. Finally, the thesis designs several combinations of feature extraction techniques and probabilistic models, analyzes their feasibility on mobile terminals in terms of actual recognition performance and algorithmic complexity, and selects the most feasible combination. Based on this best scheme, a voiceprint recognition system for mobile terminals is designed and implemented on the MATLAB platform. The system supports voiceprint acquisition, model training, voiceprint recognition, voiceprint registration, and voiceprint verification, and it has since been ported to Android.
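The two algorithmic ideas in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation, which the abstract does not detail. The fusion rule (a frame counts as speech if either detector fires), the use of the winning model's raw likelihood as the vote weight, the function names `energy_flags`, `zcr_flags`, `detect_speech`, and `vote`, and all threshold and likelihood values are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def energy_flags(frames, energy_thr):
    """Flag frames whose short-time energy exceeds a single threshold."""
    return [sum(x * x for x in f) > energy_thr for f in frames]


def zcr_flags(frames, zcr_thr):
    """Flag frames whose zero-crossing count exceeds a single threshold."""
    def crossings(f):
        # Count sign changes between neighbouring samples.
        return sum(1 for a, b in zip(f, f[1:]) if (a >= 0) != (b >= 0))
    return [crossings(f) > zcr_thr for f in frames]


def detect_speech(frames, energy_thr, zcr_thr):
    """Run both detectors independently, then fuse their per-frame flags."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        e = pool.submit(energy_flags, frames, energy_thr)
        z = pool.submit(zcr_flags, frames, zcr_thr)
        # Assumed fusion rule: speech if either detector fires.
        return [a or b for a, b in zip(e.result(), z.result())]


def vote(frame_likelihoods, weighted=True):
    """Fuse per-frame decisions into one speaker decision.

    frame_likelihoods: one dict per frame, mapping speaker -> likelihood.
    Each frame votes for its most likely speaker. With weighted=True the
    vote counts as much as that likelihood (assumed weighting); with
    weighted=False every vote counts as 1 (plain frame voting).
    """
    scores = {}
    for lik in frame_likelihoods:
        winner = max(lik, key=lik.get)
        weight = lik[winner] if weighted else 1.0
        scores[winner] = scores.get(winner, 0.0) + weight
    return max(scores, key=scores.get)


if __name__ == "__main__":
    frames = [
        [0.0, 0.01, -0.01, 0.0],  # near silence
        [0.1, -0.1, 0.1, -0.1],   # low energy, many zero crossings
        [0.9, 0.8, 0.9, 0.7],     # high energy
        [0.0, 0.0, 0.0, 0.0],     # silence
    ]
    print(detect_speech(frames, energy_thr=1.0, zcr_thr=2))

    lik = [
        {"A": 0.30, "B": 0.35},
        {"A": 0.20, "B": 0.25},
        {"A": 0.90, "B": 0.10},
    ]
    print(vote(lik, weighted=False), vote(lik, weighted=True))
```

In this toy data, two low-confidence frames narrowly favor speaker B and one high-confidence frame strongly favors speaker A, so plain voting returns B while likelihood-weighted voting returns A. In CPython the thread pool mainly shows that the two detectors share no state. It is not expected to speed up pure-Python arithmetic.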
[Degree-granting institution]: Shanghai Normal University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TN912.34

[References]