
Research on Key Issues in Speech Recognition Technology

Published: 2018-05-03 02:30

  Topics: speech recognition + signal acquisition; Source: Shaanxi Normal University, master's thesis, 2014


[Abstract]: As globalization advances, economic and trade exchanges between countries and regions keep growing, and individuals increasingly operate internationally rather than locally. Language differences, however, remain a major obstacle to this development. Advances in computer and information technology have made the computer an increasingly capable intermediary for human communication, and how to use new technology to make communication simpler and more accessible has become a widely shared concern.

Speech recognition is an important branch of pattern recognition. It takes the speech signal as its object of study and aims to enable human-computer interaction. It is an interdisciplinary field that draws on computer technology, signal processing, pattern recognition, linguistics, and other areas. Over recent decades, speech recognition has become an important bridge for fluent communication between people and machines and between people. Although the technology is now widely used across industries, and both recognition quality and efficiency have improved considerably, 100% recognition accuracy is still out of reach because of human factors in speech, environmental factors, and limitations of the recognition algorithms.

Starting from the internal and external factors that affect speech recognition, this thesis studies the key techniques and problems of the technology and discusses how to improve the recognition rate. The first part analyzes how human factors affect sample acquisition and therefore recognition accuracy. The inputs to a recognizer are signals produced by different individuals, so individual diversity means that the same sentence yields different input signals. The thesis examines this influence from several angles: the speaker's regional background, gender and physiological characteristics, speaking style, and emotional expression. The second part examines how the external environment affects speech signal acquisition. Between the moment the speaker produces the signal and the moment the recognition device captures it, uncertain external factors such as equipment noise and occasional ambient noise strongly affect acquisition, and they directly cause errors in training and recognition results. The third part surveys the algorithms and recognition models in common use. In the early signal-processing stage, the key methods include pre-emphasis of the speech signal, windowing, short-time average energy, the short-time average magnitude function, the short-time zero-crossing rate, short-time autocorrelation analysis, and the short-time-energy and zero-difference endpoint detection algorithm. Feature extraction is a major determinant of recognition accuracy, and the quality of a feature parameter depends on how completely it represents the information in the signal. Commonly used feature parameters include linear prediction coefficients (LPC), linear prediction cepstral coefficients (LPCC), and Mel-frequency cepstral coefficients (MFCC). The recognition model is the other key component; the main approaches include dynamic time warping (DTW), hidden Markov models (HMM), and vector quantization (VQ).

The thesis designs a speech recognition system that applies filter-based noise reduction to speech signals recorded in high-noise environments, uses MFCC as the feature parameters, and runs two recognition models, VQ and HMM, to observe the experimental results and analyze recognition performance for each.
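The front-end steps named in the abstract (pre-emphasis, windowing, short-time energy, short-time zero-crossing rate, and endpoint detection) can be sketched in plain Python as follows. This is an illustrative sketch only, not the thesis's implementation: the function names, the pre-emphasis coefficient of 0.97, the Hamming window, the frame length and hop size, and the single energy threshold are all assumptions chosen for the example, and the thesis may use different values and a different endpoint rule.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    out = [signal[0]]
    for n in range(1, len(signal)):
        out.append(signal[n] - alpha * signal[n - 1])
    return out


def hamming(length):
    """Hamming window coefficients of the given length."""
    if length == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]


def frames(signal, frame_len, hop):
    """Split the signal into overlapping frames, dropping any short tail."""
    return [signal[start:start + frame_len]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame, window):
    """Sum of squared windowed samples in one frame."""
    return sum((x * w) ** 2 for x, w in zip(frame, window))


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(signal, frame_len=200, hop=100, ratio=0.1):
    """Simplified, energy-only endpoint rule (an assumption for this sketch).

    Returns (first, last) indices of frames whose energy exceeds
    ratio * (peak frame energy), or None if no frame carries energy.
    """
    window = hamming(frame_len)
    energies = [short_time_energy(f, window)
                for f in frames(pre_emphasis(signal), frame_len, hop)]
    if not energies or max(energies) == 0:
        return None
    threshold = ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e > threshold]
    return active[0], active[-1]
```

Calling `detect_endpoints` on a list of samples returns the indices of the first and last frames whose energy exceeds 10% of the peak frame energy, or `None` for an all-silent input. A practical detector would typically combine the energy threshold with `zero_crossing_rate` so that low-energy unvoiced sounds at word boundaries are not cut off.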
[Degree-granting institution]: Shaanxi Normal University
[Degree level]: Master's
[Year degree granted]: 2014
[Classification number]: TN912.34



Article ID: 1836593


Article link: https://www.wllwen.com/kejilunwen/wltx/1836593.html

