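Research on Algorithms for Speaker Voiceprint Recognition
Published: 2018-05-31 23:06
Topics: speaker recognition + speaker verification; Reference: Zhejiang University, master's thesis, 2017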
[Abstract]: Speaker voiceprint recognition is a means of identity authentication that uses the voice as the recognition feature. Research on and implementation of this technology is of far-reaching significance for accelerating the adoption of speaker recognition in practical commercial applications. Text-independent speaker verification is one of the research directions of speaker recognition. Mainstream algorithms are based on probabilistic statistical models. Given sufficient speech data, the GMM-UBM (Gaussian Mixture Model-Universal Background Model) achieves good performance, but under noisy conditions and channel mismatch its recognition performance is hard to improve further. Total variability factor (i-vector) analysis was proposed to address this: it maps utterances of varying length to low-dimensional vectors and handles the channel problem in that low-dimensional space. LDA (Linear Discriminant Analysis) and PLDA (Probabilistic Linear Discriminant Analysis) are commonly used channel compensation techniques, although the latter is often used as a scoring tool. This thesis takes the GMM-UBM model as its basic research framework and further studies a speaker verification system based on i-vectors and the PLDA model. The main research contents are as follows:
(1) For the application of speaker recognition on cloud platforms, a cloud-based speaker recognition system framework is proposed. The speech preprocessing procedure and the extraction pipeline for Mel-frequency cepstral coefficients (MFCC), a feature based on human auditory perception, are analyzed.
(2) A speaker recognition system based on the GMM-UBM model is built. The UBM training process and the MAP adaptation process are described in detail. An experimental database is set up to investigate how the number of UBM training speakers, the number of Gaussian components in the model, the training utterance length, the test utterance length, and the MFCC feature dimension affect system performance.
(3) A speaker verification system based on i-vectors and the PLDA model is built, and the i-vector extraction algorithm and the PLDA model are analyzed. Experiments compare the performance of the different systems and investigate how the norm transformation, the i-vector dimension, and the PLDA factor dimension affect system performance.
(4) LDA and WCCN normalization are combined to perform channel compensation and dimensionality reduction on the i-vectors, and the effect of this technique on the experimental results is analyzed in depth. To address the limited classification performance of LDA, an improved classification algorithm is proposed and verified experimentally.
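The abstract names MAP adaptation of a UBM as the core of the GMM-UBM system but gives no details. The sketch below illustrates one common form of it, mean-only MAP adaptation followed by a log-likelihood-ratio score, and is not taken from the thesis. The function names, the diagonal-covariance assumption, the relevance factor of 16, and the toy two-component UBM are all assumptions made for illustration.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """Log density of each frame under each diagonal Gaussian.
    X: (T, D); means, variances: (C, D). Returns (T, C)."""
    diff = X[:, None, :] - means[None, :, :]
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)
    return -0.5 * (log_norm[None, :]
                   + np.sum(diff ** 2 / variances[None, :, :], axis=2))

def map_adapt_means(X, weights, means, variances, relevance=16.0):
    """Mean-only MAP adaptation of a UBM to one speaker's frames X.
    `relevance` is an assumed, conventional default."""
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    log_p -= log_p.max(axis=1, keepdims=True)          # numerical stability
    post = np.exp(log_p)
    post /= post.sum(axis=1, keepdims=True)            # responsibilities (T, C)
    n = post.sum(axis=0)                               # soft counts (C,)
    ex = (post.T @ X) / np.maximum(n, 1e-10)[:, None]  # per-component data mean
    alpha = (n / (n + relevance))[:, None]             # adaptation coefficients
    return alpha * ex + (1.0 - alpha) * means

def avg_log_likelihood(X, weights, means, variances):
    """Average per-frame log-likelihood of X under the GMM."""
    log_p = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    m = log_p.max(axis=1, keepdims=True)
    return float(np.mean(m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))))

def llr_score(X, weights, ubm_means, spk_means, variances):
    """Verification score: speaker model versus UBM."""
    return (avg_log_likelihood(X, weights, spk_means, variances)
            - avg_log_likelihood(X, weights, ubm_means, variances))

# Toy usage with synthetic 2-D "features" (hypothetical data, not MFCCs).
rng = np.random.default_rng(0)
ubm_w = np.array([0.5, 0.5])
ubm_mu = np.array([[-2.0, 0.0], [2.0, 0.0]])
ubm_var = np.ones((2, 2))
enroll = rng.normal(size=(200, 2)) + np.array([2.5, 0.5])
spk_mu = map_adapt_means(enroll, ubm_w, ubm_mu, ubm_var)
print(llr_score(enroll, ubm_w, ubm_mu, spk_mu, ubm_var))
```

Only the means are adapted here, while weights and variances are shared with the UBM. Components that see few frames of the speaker's data stay close to the UBM, which is why the relevance factor matters when enrollment speech is short.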
[Degree-granting institution]: Zhejiang University
[Degree level]: Master's
[Year degree conferred]: 2017
[Classification number]: TN912.34
Article ID: 1961701
Article link: https://www.wllwen.com/kejilunwen/xinxigongchenglunwen/1961701.html