Rerecorded Speech Detection Algorithm
Published: 2018-08-08 12:23
[Abstract]: An impostor can deceive a speaker recognition system and gain access to it by playing back a rerecording of a legitimate user's speech, which poses a threat to public security. Detecting rerecorded speech is therefore urgent in practice, yet published research on the problem remains scarce. This paper proposes a detection algorithm for rerecorded speech. The algorithm uses statistics of MFCC (Mel-Frequency Cepstral Coefficients) as the features for SVM (Support Vector Machine) and KNN (K-Nearest Neighbors) classifiers; in addition to these two classifiers, the paper also examines the detection performance obtained with an SAE (Sparse Autoencoder). To simulate realistic rerecording scenarios, the experiments test the algorithm comprehensively across different recording devices, recording distances, and recording environments. The experimental results show that increasing the diversity of the rerecorded speech used for training raises the algorithm's accuracy to 99.67%, a good level of detection performance.
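The feature-and-classifier idea in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not in the abstract: the per-frame MFCC vectors are taken as already computed, the "statistics" are assumed to be the per-coefficient mean and standard deviation, the KNN uses Euclidean distance with k = 3, and all numbers are toy values invented for illustration rather than data from the paper. The function names `mfcc_statistics` and `knn_predict` are hypothetical.

```python
import math
from collections import Counter


def mfcc_statistics(frames):
    """Collapse per-frame MFCC vectors into one fixed-length feature vector.

    Assumption: the statistics are the mean and (population) standard
    deviation of each coefficient; the abstract does not say which are used.
    """
    n = len(frames)
    dims = len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n)
        for d in range(dims)
    ]
    return means + stds


def knn_predict(train_features, train_labels, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbours."""
    ranked = sorted(
        (math.dist(x, query), label)
        for x, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy "utterances": each is a short list of 2-coefficient MFCC frames.
# These values are made up purely to exercise the code.
utterances = [
    ([[1.0, 0.5], [1.2, 0.7], [0.8, 0.3]], "original"),
    ([[1.1, 0.4], [0.9, 0.6], [1.0, 0.5]], "original"),
    ([[3.0, 2.0], [3.4, 2.2], [2.6, 1.8]], "rerecorded"),
    ([[3.1, 2.1], [2.9, 1.9], [3.0, 2.0]], "rerecorded"),
]
train_features = [mfcc_statistics(frames) for frames, _ in utterances]
train_labels = [label for _, label in utterances]

query_frames = [[2.9, 2.0], [3.1, 2.2], [3.0, 1.8]]
predicted = knn_predict(train_features, train_labels, mfcc_statistics(query_frames))
print(predicted)
```

Summarising each utterance by statistics gives every recording a feature vector of the same length regardless of its duration, which is what lets fixed-input classifiers such as KNN or SVM be applied. In practice the MFCC frames would come from an audio front end, and an SVM could be substituted for `knn_predict` on the same feature vectors.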
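[Author affiliations]: School of Information Engineering, Wuyi University; School of Electronics and Information, Guangdong Polytechnic Normal University
[Funding]: National Natural Science Foundation of China (61672173, 61372193, 61072127); National Natural Science Foundation of China, Young Scientists Fund (61100168); Natural Science Foundation of Guangdong Province (S2013010013311, 2014A030313623); Guangdong Province Characteristic Innovation Project for Universities (2015KTSCX083)
[Classification number]: TN912.3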
Article ID: 2171778
Article link: https://www.wllwen.com/kejilunwen/xinxigongchenglunwen/2171778.html