Research and Implementation of a Text-Independent Voiceprint Recognition Algorithm

Published: 2018-12-20 06:32

[Abstract]: With the rapid development of Internet technology, networks have reached almost every corner of social life. In the Internet environment, traditional identity authentication methods face serious challenges and are increasingly unable to meet the needs of real applications. Among authentication methods, biometric identification, which identifies a person from inherent physiological traits and acquired behavioral traits, has been widely adopted in practice because of its unique advantages. Among biometric techniques, text-independent voiceprint verification is considered one of the most practical: it confirms a speaker's identity from the target speaker's speech, and it is an important branch of speech recognition research. In real deployments, factors such as the acquisition device and the transmission channel leave very little usable speech data, which makes it hard for a system to reach the desired recognition performance and execution efficiency. This thesis therefore studies text-independent voiceprint verification for short utterances.

Recognition rate and computational complexity are the key measures of a voiceprint verification system. The traditional UBM-MAP-GMM architecture mitigates, to some extent, the mismatch between test speech and training speech, and its recognition performance is fairly good. For short utterances in practical applications, however, it is computationally demanding and not very robust. This thesis therefore studies voiceprint recognition algorithms with the aims of reducing computation and improving the recognition rate. The main contributions are as follows:

1. The thesis analyzes how the initial model parameters affect the EM algorithm during model training. Traditional K-means selects its initial cluster centers at random, which can cause the algorithm to converge to a local optimum. To address this defect, an initial cluster center selection algorithm based on density and distance is proposed as an improvement to K-means, and the algorithm is validated experimentally.
2. The UBM-MAP-GMM architecture is analyzed. To address its heavy computation, the fact that every speaker's GMM follows the same model structure, and the influence of some Gaussian components on the recognition result, a voiceprint verification method based on a UBM-CM-MAP-GMM architecture is proposed. Experiments show that the method gives some improvement in both recognition time and equal error rate.
3. Within the UBM-CM-MAP-GMM architecture, the number of mixture components of the speaker GMM is studied. The experimental data show the best results when the GMM has half as many mixture components as the UBM.
4. Short-utterance voiceprint verification software is implemented on the UBM-CM-MAP-GMM architecture, and its recognition efficiency is analyzed and verified experimentally. Compared with the traditional UBM-MAP-GMM architecture, the improved algorithm reduces both the amount of computation and the equal error rate to some extent.
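The abstract does not spell out the density-and-distance rule used to choose initial K-means centers, so the following is only a minimal sketch of one plausible reading, not the thesis's algorithm. It assumes that density is the number of neighbors within a fixed radius, that the first center is the densest point, and that each later center maximizes density times distance to the nearest center already chosen. The function name `density_distance_init` and the `radius` parameter are hypothetical.

```python
import math


def density_distance_init(points, k, radius):
    """Pick k initial cluster centers from `points` without random sampling.

    Assumed heuristic (not taken from the thesis): prefer points that lie
    in dense regions and are far from the centers already selected.
    """
    n = len(points)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of points")

    # Pairwise Euclidean distances (O(n^2), fine for a small illustration).
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: how many other points fall within `radius`.
    density = [
        sum(1 for j in range(n) if j != i and dist[i][j] <= radius)
        for i in range(n)
    ]

    # First center: the densest point (ties resolved by lowest index).
    chosen = [max(range(n), key=lambda i: density[i])]

    while len(chosen) < k:
        def score(i):
            # Distance to the nearest already-chosen center, weighted by
            # density, so isolated points (density 0) score zero.
            return density[i] * min(dist[i][c] for c in chosen)

        candidates = [i for i in range(n) if i not in chosen]
        chosen.append(max(candidates, key=score))

    return [points[i] for i in chosen]


# Two tight groups of four points and one isolated point.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),
    (10.0, 0.0),
]
centers = density_distance_init(points, k=2, radius=0.5)
print(centers)
```

In this toy data the two selected centers fall in the two dense groups and the isolated point is skipped. That is the behavior random initialization cannot guarantee, and the selection is deterministic for a given `radius`. The returned centers would then seed ordinary K-means iterations, whose result could in turn initialize EM training of the GMM.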
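[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree awarded]: 2017
[Classification number]: TN912.3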

[Similar literature]