
Research on Simultaneous Recognition of Speaker Identity and Speech Content and Its Applications

Published: 2018-04-16 11:41

  Topics: speech content recognition + speaker identity recognition; Source: Jiangnan University, master's thesis, 2015


[Abstract]: With the widespread use of computer technology, speech recognition has become an active research topic. Speech is the most natural mode of human-computer interaction, and speech recognition is the key to voice-based interaction. Certain applications require speaker identity and speech content to be recognized simultaneously, with recognition algorithms suited to embedded systems such as in-vehicle systems and smart homes. This thesis studies simultaneous recognition of speaker identity and speech content and applies it to a voice control system for a smart home environment. The main contributions are as follows.

(1) Endpoint detection and feature extraction for speech signals are studied and used to preprocess the speech signal. Several common speech adaptation methods are examined, and the mechanism for simultaneous recognition of speaker identity and speech content proposed by Herbig et al. in 2011 is studied in depth as the basis for simultaneous recognition.

(2) Ensemble learning is combined with speech recognition to implement a speech content recognition method based on Bagging and GMM (Gaussian mixture models), which improves both the recognition rate and its stability. For resource-constrained embedded systems, multiple speech content recognition models are combined using SQ (Soft Quantization), which effectively reduces the space complexity of the recognition model and makes the speech content recognition system better suited to embedded environments. Compared with the conventional voting-based ensemble method, this approach also improves the recognition rate and stability of the system when only a small number of models are combined. To recognize the speaker group and the speech content simultaneously, SQ is used to combine speaker group models with speech content recognition models: the optimal decoder for each speech frame is computed in real time, and a vote is cast for the model with the highest SQ score. Speaker group recognition is performed by comparing the models' vote shares, while speech content recognition is performed by the optimal decoder. In the experiments, when six speech content recognition models are combined, the average speech content recognition rate is 88% and the average speaker group recognition rate is 81.56%. These results confirm the feasibility of simultaneous speaker group and speech content recognition in specific application settings.

(3) The simultaneous speaker group and speech content recognition algorithm is used to implement a system for simultaneous recognition of speaker identity and speech content in a smart home environment. In the experiments, when five speech content recognition models are combined, the speech content recognition rate reaches 96.64% and the speaker group recognition rate is 88.24%. The results show that the method is suitable for simultaneous recognition of speaker identity and speech content in a smart home environment.
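The frame-wise voting step in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the abstract does not define how the SQ score is computed, so the sketch assumes a softmax over per-frame log-likelihoods as a stand-in, and it models each speaker group with a single diagonal Gaussian rather than a full GMM. The names `soft_scores`, `recognize_group`, `group_a` and `group_b` are hypothetical.

```python
import math
from collections import Counter


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def soft_scores(frame, models):
    """Assumed stand-in for the SQ score: softmax of per-model log-likelihoods."""
    logs = {name: log_gauss_diag(frame, mean, var)
            for name, (mean, var) in models.items()}
    peak = max(logs.values())  # subtract the maximum for numerical stability
    exps = {name: math.exp(value - peak) for name, value in logs.items()}
    total = sum(exps.values())
    return {name: e / total for name, e in exps.items()}


def recognize_group(frames, models):
    """Cast one vote per frame for the best-scoring model.

    Returns the model with the largest vote share, the vote shares, and the
    per-frame selection (which a full system could use to pick a decoder).
    """
    votes = Counter()
    per_frame = []
    for frame in frames:
        scores = soft_scores(frame, models)
        best = max(scores, key=scores.get)
        per_frame.append(best)
        votes[best] += 1
    shares = {name: votes[name] / len(frames) for name in models}
    return max(shares, key=shares.get), shares, per_frame


# Toy data: two hypothetical speaker-group models given as (mean, variance).
models = {
    "group_a": ([0.0, 0.0], [1.0, 1.0]),
    "group_b": ([3.0, 3.0], [1.0, 1.0]),
}
frames = [[0.1, -0.2], [0.4, 0.3], [2.9, 3.1], [-0.3, 0.2]]
winner, shares, per_frame = recognize_group(frames, models)
print(winner, shares, per_frame)
```

In this toy run, three of the four frames lie closer to `group_a`, so it wins on vote share. In a system like the one the abstract describes, the per-frame selection would also determine which content recognition model decodes that frame.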
[Degree-granting institution]: Jiangnan University
[Degree level]: Master's
[Year conferred]: 2015
[Classification number]: TN912.34



