Research on Speech Segmentation Based on Adaptive Neuro-Fuzzy Inference and Hidden Markov Models

Published: 2018-12-27 09:00
[Abstract]: Modern speech technology and research require highly accurate and reliable speech segmentation. Manual segmentation has long been regarded as the most reliable and accurate method. However, it is time-consuming and labor-intensive, and it must be carried out by phonetic experts. In the era of big data, and for large speech corpora in particular, this is a critical drawback, so developing high-accuracy automatic speech segmentation is essential. The dominant automatic technique is forced alignment. In this method, hidden Markov models (HMMs) are used to build acoustic models of the individual phonemes, and the speech signal is converted into a sequence of feature vectors, one per frame. The models yield approximate boundaries between phonemes, but the results are not accurate enough. On the TIMIT corpus, conventional HMM-based forced alignment systems reach an accuracy of 80% to 89% at a 20 ms tolerance.

Many methods have been proposed to improve HMM-based automatic speech segmentation. Some researchers have recognized that the gap between HMM-based automatic segmentation and manual segmentation arises because phonetic experts bring knowledge about segmentation to the task. Fuzzy logic can translate such knowledge intuitively into fuzzy rules that a computer can use. However, fuzzy rules must be carefully designed by experts, and their completeness cannot be guaranteed. The aim of this study is to propose a more suitable improvement that addresses these problems. The adaptive neuro-fuzzy inference system (ANFIS) is a machine learning method that combines neural networks with fuzzy inference systems. Compared with other machine learning methods, it combines the strengths of both and performs well. It is simple to implement, nonlinear, and based on fuzzy inference rules, which makes it well suited to the problems described above.

In this work, ANFIS is used to learn how to correct the positions of segmentation points, compensating both for the difference between manual and machine segmentation and for the systematic segmentation error introduced by the HMM itself. The experiment has two steps. First, context-independent HMMs are used to obtain the initial phoneme boundaries. Second, a trained ANFIS corrects the boundaries obtained in the first step. The experiments use the TIMIT database. The results show that ANFIS significantly improves the accuracy of HMM-based automatic speech segmentation: on the TIMIT corpus, with a 20 ms tolerance as the evaluation criterion, accuracy rises from 86.25% to 92.08%. This demonstrates the effectiveness of ANFIS for speech segmentation. In addition, the method is easier to build and apply. In future work, we will continue to improve the accuracy of the system and apply it to other databases.
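The 20 ms tolerance criterion and the second-stage boundary correction described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the thesis's implementation: the function names, the one-to-one pairing of automatic and manual boundaries, the Gaussian membership functions, and the first-order Sugeno rule form are all assumptions. The abstract does not state which input features or rule structure the ANFIS used, and the numbers in the usage lines are made up.

```python
import math


def boundary_accuracy(predicted_ms, reference_ms, tolerance_ms=20.0):
    """Fraction of automatic boundaries within tolerance_ms of the manual ones.

    Assumes forced alignment produced exactly one boundary per manual
    boundary, in the same order (hypothetical pairing convention).
    """
    if len(predicted_ms) != len(reference_ms):
        raise ValueError("boundary lists must have the same length")
    if not reference_ms:
        return 0.0
    hits = sum(
        1 for p, r in zip(predicted_ms, reference_ms)
        if abs(p - r) <= tolerance_ms
    )
    return hits / len(reference_ms)


def gaussian(x, center, width):
    """Gaussian membership degree of x in a fuzzy set (assumed shape)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def sugeno_correction(features, rules):
    """Forward pass of a first-order Sugeno fuzzy system.

    This is the kind of inference an ANFIS tunes during training. Each rule
    is a tuple (centers, widths, coeffs, bias): the firing strength is the
    product of the Gaussian memberships of the features, and the rule output
    is the linear function bias + sum(coeffs[i] * features[i]).
    Returns the strength-weighted average of the rule outputs, read here as
    a boundary offset in milliseconds.
    """
    weighted_sum = 0.0
    strength_total = 0.0
    for centers, widths, coeffs, bias in rules:
        strength = 1.0
        for x, c, w in zip(features, centers, widths):
            strength *= gaussian(x, c, w)
        output = bias + sum(a * x for a, x in zip(coeffs, features))
        weighted_sum += strength * output
        strength_total += strength
    if strength_total == 0.0:
        return 0.0  # no rule fired: leave the boundary unchanged
    return weighted_sum / strength_total


# Made-up boundaries (ms): errors are 5, 30, 2 and 10 ms, so 3 of 4 are hits.
hmm_boundaries = [100.0, 250.0, 400.0, 610.0]
manual_boundaries = [105.0, 220.0, 398.0, 600.0]
print(boundary_accuracy(hmm_boundaries, manual_boundaries))  # 0.75

# One hypothetical rule with a constant output: shift the boundary by -10 ms.
rules = [([0.0], [1.0], [0.0], -10.0)]
corrected = hmm_boundaries[3] + sugeno_correction([0.3], rules)
print(round(corrected, 6))  # 600.0
```

In an actual ANFIS the membership parameters (centers, widths) and the linear consequent parameters (coeffs, bias) would be learned from pairs of HMM boundaries and manual boundaries. Here they are fixed by hand purely to show the inference step.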
[Degree-granting institution]: Tianjin University
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TN912.3

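This thesis: Dong Liang; Research on Speech Segmentation Based on Adaptive Neuro-Fuzzy Inference and Hidden Markov Models [D]; Tianjin University; 2014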

Article ID: 2392821

Article link: https://www.wllwen.com/kejilunwen/wltx/2392821.html

