Design of a Reconfigurable Speech Recognition System-on-Chip

Published: 2019-03-21 20:22

[Abstract]: In recent years, embedded speech recognition systems have been widely used in smart homes, industrial control, mobile terminals, and other fields, and they are changing people's lives. Because speech is the most natural way for people to communicate, embedded systems that use speech recognition for human-computer interaction have become an increasingly active research topic. However, existing speech recognition systems either use so much CPU time that no other task can run, are too large to use in an embedded system, or depend so heavily on the network that they can recognize only a limited vocabulary when no network is available. Solving these problems requires deeper study of system architecture for embedded speech recognition. This thesis proposes a reconfigurable on-chip speech recognition system that alleviates these conflicts to a certain extent. The main work is as follows. First, the thesis studies the processing of speech signals. From a signal processing perspective, it discusses the principles of the key techniques used in speech recognition, including pre-emphasis, endpoint detection, and feature extraction. Second, it introduces the basic principles of the hidden Markov model (HMM) and the Gaussian mixture model (GMM). By discussing the three problems of the HMM, and in particular by treating in detail the HMM's B parameter when it is represented by a GMM, it establishes the principles behind training and recognition in the speech recognition system. Third, using the ZYNQ7000 as the SoC design platform, it builds an embedded speaker-independent isolated-word speech recognition system. Building on a study of the ZYNQ7000's reconfigurability, the work has two parts. On the one hand, starting from existing PC-side training software, the recognition model is upgraded to a GMM-based HMM (GMM-HMM) to form a system verification platform that supplies recognition templates and hardware test data to the recognition system. This includes studying and implementing the training and recognition algorithms, and converting the system's intermediate data into a format that is easy to use in hardware testing. On the other hand, the recognition algorithm is ported to the ZYNQ7000 platform to build the on-chip speech recognition system. This includes evaluating the recognition flow to partition the system into software and hardware, and adapting the key speech recognition algorithms to suit hardware characteristics. It also includes hardware reconfiguration of the key computing units, so that common digital signal processing algorithms are implemented in hardware logic. The thesis focuses mainly on the reconfiguration of the MFCC computing unit. Finally, based on tests of the system's recognition rate and real-time performance, the thesis describes the advantages of a reconfigurable on-chip speech recognition system and gives an outlook on future work.
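To make the "B parameter" mentioned in the abstract concrete, the following is a minimal sketch of how the emission probability of one HMM state might be evaluated when it is modeled by a GMM. It is not the thesis's implementation. The diagonal-covariance assumption and the function names `log_gaussian_diag` and `log_emission` are illustrative choices that do not come from the source.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x.

    Assumption: covariances are diagonal, so the density factorizes
    over the feature dimensions.
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_emission(x, weights, means, variances):
    """Log emission probability of one HMM state modeled by a GMM.

    weights:   mixture weights of the state (positive, summing to 1)
    means:     one mean vector per mixture component
    variances: one variance vector per mixture component
    """
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    # Log-sum-exp: subtract the largest term before exponentiating so the
    # mixture sum does not underflow for very small densities.
    return peak + math.log(sum(math.exp(t - peak) for t in terms))
```

A call such as `log_emission(frame, weights, means, variances)` would be made once per state and per feature frame (for example, one MFCC vector). Working in the log domain is a common choice for recognition code because products of many small probabilities underflow quickly.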
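[Degree-granting institution]: University of Electronic Science and Technology of China
[Degree level]: Master's
[Year degree conferred]: 2014
[Classification number]: TN912.34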
