Research and Implementation of Speaker Recognition Based on Deep Belief Networks

Published: 2018-05-20 17:22

Topic: speaker recognition + deep neural network; Source: Nanjing University of Posts and Telecommunications, master's thesis, 2017


[Abstract]: With the rapid development of multimedia information technology, online speech resources have grown explosively, so classifying and recognizing speech has become increasingly important. Speaker recognition can distinguish speakers from a small amount of audio and thereby provide identity authentication; it is a key technology in speech signal processing. Traditional speaker recognition systems, however, often suffer from insufficient learning and from models that are too shallow, and when corpus data is scarce the model is often not complex enough to represent the true underlying model. Building on an analysis of the strengths and weaknesses of existing speaker recognition methods, this thesis uses deep learning to design and implement a speaker recognition system. The main work is as follows. (1) The characteristics and difficulties of speaker recognition methods and feature extraction approaches are summarized, and the strengths and weaknesses of the commonly used speaker recognition strategies, models, and algorithms are compared. (2) A speaker recognition framework based on deep learning is studied. Deep learning theory is applied to the traditional speaker recognition system: restricted Boltzmann machines and the backpropagation algorithm are used to train a deep belief network, which overcomes the efficiency problem of training a multilayer network model directly. (3) Speaker recognition based on i-vector analysis in channel environments is introduced. Building on the i-vector method, traditional Gaussian mixture speaker recognition is improved, and a method that combines the uncompressed i-vector form with deep learning is proposed. Experiments with this method compare its recognition rate against traditional methods and examine the effect of speaker gender on the recognition rate. (4) Following the processing flow of speaker recognition, the structure of a deep-learning-based speaker recognition system is given, and its core modules are designed in detail and implemented in simulation. Finally, the performance of the various speaker recognition systems is tested and the results are analyzed.
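The abstract states that restricted Boltzmann machines and backpropagation are used to train the deep belief network. As a rough illustration of the layer-wise pretraining half of that idea, the sketch below stacks two tiny binary RBMs and trains each with one-step contrastive divergence (CD-1). It is not the thesis's implementation: the class and function names, layer sizes, learning rate, epoch count, and toy input vectors are all assumptions made for illustration, and the supervised backpropagation fine-tuning stage is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Tiny binary restricted Boltzmann machine (illustrative only)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights, zero biases.
        self.W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        # P(h_j = 1 | v) for every hidden unit j.
        return [sigmoid(self.b_h[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_h))]

    def visible_probs(self, h):
        # P(v_i = 1 | h) for every visible unit i.
        return [sigmoid(self.b_v[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_v))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr=0.1):
        # Positive phase on the data, negative phase after one Gibbs step.
        h0 = self.hidden_probs(v0)
        v1 = self.visible_probs(self.sample(h0))
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
        for i in range(len(v0)):
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_h[j] += lr * (h0[j] - h1[j])


def pretrain_stack(data, layer_sizes, epochs=20, seed=0):
    """Greedy layer-wise pretraining: each RBM is trained on the
    hidden activations of the RBM below it."""
    rng = random.Random(seed)
    stack, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v)
        stack.append(rbm)
        inputs = [rbm.hidden_probs(v) for v in inputs]
    return stack


# Toy binary vectors standing in for real acoustic features (hypothetical data).
data = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 4
stack = pretrain_stack(data, layer_sizes=[3, 2])
top_features = stack[1].hidden_probs(stack[0].hidden_probs(data[0]))
```

In a full deep belief network pipeline, the pretrained weights would typically initialize a feed-forward network that is then fine-tuned on speaker labels with backpropagation; that stage is left out here to keep the sketch short.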
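[Degree-granting institution]: Nanjing University of Posts and Telecommunications
[Degree level]: Master's
[Year conferred]: 2017
[Classification number]: TN912.34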
