
Research on Speaker Recognition Technology Based on Deep Learning

Published: 2018-03-20 23:53

  Topic: speaker recognition. Focus: deep learning. Source: Dalian University of Technology, master's thesis, 2014. Document type: degree thesis.


[Abstract]: Speaker recognition, commonly called voiceprint recognition, is an identity authentication technology. It offers high user acceptance, low equipment cost, good scalability, and easy portability, and it can be widely applied in national defense and the military, banking systems, communications, the Internet, and public security and justice. Speaker recognition technology has made important progress and products have reached the market, but many problems still require in-depth study.

Deep learning is a neural network model developed in recent years. It overcomes problems such as insufficient learning and insufficient depth, and it can be used in pattern classification, target tracking, and other fields. This thesis applies deep learning theory to speaker recognition and studies the technology from three aspects: a speaker recognition system based on deep learning, a speaker recognition algorithm based on improved features, and a speaker recognition algorithm based on an improved statistical criterion. The main work is as follows:

(1) Performance study of a speaker recognition system based on deep learning. Deep learning theory is introduced into the speaker recognition system. On this basis, the thesis analyzes the effect of different unit lengths of test speech on the recognition rate; the effect of different speech feature parameters on recognition accuracy under the same test conditions; and the effect of different numbers of deep learning layers and of nodes per layer on the system recognition rate under the same conditions. The results demonstrate the correctness and reliability of applying deep learning to speaker recognition systems.

(2) Speaker recognition algorithm based on improved features. The thesis combines the MFCC and GFCC speech feature parameters, which model the auditory characteristics of the human ear, into a single speech feature vector and applies it to the speaker recognition system, which improves the recognition rate.

(3) Speaker recognition algorithm based on an improved statistical criterion. Because the traditional statistical recognition algorithm can misjudge when identifying among multiple speakers, the thesis applies a statistical criterion based on frame-wise probability scoring and carries out speaker recognition experiments. Simulation experiments verify the feasibility and effectiveness of the improved statistical criterion.
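
Contributions (2) and (3) in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the thesis's actual formulas, so everything below is an assumption made for illustration: the function names are hypothetical, feature fusion is taken to be plain per-frame concatenation of MFCC and GFCC vectors, the "traditional" criterion is taken to be a majority vote over per-frame decisions, and frame-wise probability scoring is taken to be the sum of per-frame log-probabilities for each speaker. The per-frame probabilities would normally come from the network's output layer; here they are hand-written numbers.

```python
import math
from collections import Counter


def concat_features(mfcc_frames, gfcc_frames):
    """Join the MFCC and GFCC vectors of each frame into one feature vector.

    Assumption: both streams were extracted with the same framing, so
    frame i of one stream lines up with frame i of the other.
    """
    if len(mfcc_frames) != len(gfcc_frames):
        raise ValueError("both feature streams must have the same number of frames")
    return [list(m) + list(g) for m, g in zip(mfcc_frames, gfcc_frames)]


def majority_vote(frame_posteriors):
    """Assumed baseline: every frame votes for its most probable speaker."""
    votes = Counter(
        max(range(len(p)), key=p.__getitem__) for p in frame_posteriors
    )
    return votes.most_common(1)[0][0]


def frame_probability_score(frame_posteriors, eps=1e-12):
    """Assumed frame-wise scoring: sum log-probabilities per speaker.

    Returns (index of best speaker, list of accumulated log scores).
    eps guards against log(0) when a probability is exactly zero.
    """
    n_speakers = len(frame_posteriors[0])
    scores = [0.0] * n_speakers
    for p in frame_posteriors:
        for k in range(n_speakers):
            scores[k] += math.log(max(p[k], eps))
    best = max(range(n_speakers), key=scores.__getitem__)
    return best, scores


if __name__ == "__main__":
    # Illustrative posteriors for 3 frames and 3 enrolled speakers.
    posteriors = [
        [0.40, 0.35, 0.25],  # weak preference for speaker 0
        [0.40, 0.35, 0.25],  # weak preference for speaker 0
        [0.05, 0.90, 0.05],  # strong preference for speaker 1
    ]
    print("majority vote:", majority_vote(posteriors))
    print("probability scoring:", frame_probability_score(posteriors)[0])
```

With these illustrative numbers the two criteria disagree: two weakly confident frames outvote one strongly confident frame under majority voting, while accumulated log-probabilities favor the speaker supported by the confident frame. This is one way a vote-based decision among multiple speakers could misjudge, though whether it matches the failure mode studied in the thesis is not stated in the abstract.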
[Degree-granting institution]: Dalian University of Technology
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TN912.3

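[Citing literature]

Related master's theses: 1 entry

1 Lü Chao; Research and Implementation of Parallelized Methods for Sound Source Identification and Localization [D]; Jiangsu University of Science and Technology; 2016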



Article ID: 1641354


Article link: https://www.wllwen.com/kejilunwen/wltx/1641354.html

