Control System for a Household Service Robot Based on Speech Recognition

Published: 2018-08-28 08:15

[Abstract]: This thesis studies the complete speech recognition pipeline, which consists of speech signal filtering, sampling, quantization, windowing, endpoint detection, feature extraction, model training, and threshold comparison, and simulates the algorithm models in Matlab. An interactive speech recognition interface is also implemented with Matlab GUI design tools. Building on this speech recognition theory, a five-degree-of-freedom Arduino dual-arm robot is assembled and combined with an ASR M08-A speech recognition module so that the robot performs various planned actions under voice control. After the speech signal has been filtered, sampled, and quantized into a discrete digital signal, it is pre-emphasized; the purpose of pre-emphasis is to filter out low-frequency interference and boost the high-frequency components of the input signal. Framing cuts the original signal into successive segments, which is equivalent to applying a rectangular window to the original signal in the time domain. Multiplication by a rectangular window in the time domain corresponds to convolving the signal spectrum with the Fourier transform of the rectangular window in the frequency domain. The dual-threshold endpoint detection algorithm locates the endpoints of the speech signal using two thresholds, one on the short-time average energy and one on the zero-crossing rate. Once the endpoints have been detected, the feature parameters of the speech signal are obtained from the Mel-frequency cepstral coefficients (MFCC) and the first-order difference MFCC. An improved feature extraction algorithm is also proposed, and the computation steps are given for wavelet-transform-based linear prediction cepstral coefficients and wavelet-transform-based Mel-frequency cepstral coefficients. Finally, the feature coefficient matrix is formed from the wavelet-transform-based linear prediction cepstral coefficients (DWTL) with their difference parameters (ΔDWTL) and the wavelet-transform-based Mel-frequency cepstral coefficients (DWTM) with their difference parameters (ΔDWTM). Model training is then carried out with the hidden Markov model, using the forward-backward algorithm, the Viterbi algorithm, and the Baum-Welch algorithm. The interactive speech recognition simulation interface is implemented through Matlab GUI design and the writing of callback functions. On the basis of this speech recognition theory, the system is integrated with the Arduino dual-arm robot. For the five-degree-of-freedom service robot arm, the forward and inverse kinematics problems are solved with a general coordinate rotation transformation algorithm. The forward kinematics problem computes the pose of the end effector from the known joint variables of the robot; the inverse kinematics problem computes the rotation angle of each joint from the required position and orientation of the end effector. Finally, the ASR M08-A speech recognition module, a 32-channel servo control board, an Arduino ATmega2560 control board, and the five-degree-of-freedom manipulator are combined to control the motion of the robot arm by voice.
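The front-end steps described in the abstract (pre-emphasis, framing with a rectangular window, and dual-threshold endpoint detection) can be illustrated with a minimal sketch. This is not the thesis's Matlab implementation. The pre-emphasis coefficient 0.97, the frame length and hop, the three threshold values, and all function names are assumptions chosen for illustration, and the way the low-energy and zero-crossing thresholds are combined into a single outward search is a simplification of the usual two-stage procedure.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1].

    alpha = 0.97 is a commonly used value, assumed here.
    """
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]


def frame_signal(signal, frame_len, hop):
    """Cut the signal into overlapping frames (an implicit rectangular window).

    A trailing partial frame is dropped.
    """
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:])
                    if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def detect_endpoints(frames, energy_high, energy_low, zcr_thresh):
    """Simplified dual-threshold endpoint detection.

    Frames whose energy exceeds energy_high are treated as definite speech.
    The segment is then extended outward while a neighbouring frame exceeds
    either the lower energy threshold or the zero-crossing threshold.
    Returns (first_frame, last_frame), or None if no frame is definite speech.
    """
    energies = [short_time_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]
    strong = [i for i, e in enumerate(energies) if e > energy_high]
    if not strong:
        return None
    start, end = strong[0], strong[-1]
    while start > 0 and (energies[start - 1] > energy_low
                         or zcrs[start - 1] > zcr_thresh):
        start -= 1
    while end < len(frames) - 1 and (energies[end + 1] > energy_low
                                     or zcrs[end + 1] > zcr_thresh):
        end += 1
    return start, end


def demo_endpoints():
    """Synthetic test: 0.1 s silence, 0.1 s of a 200 Hz tone, 0.1 s silence."""
    fs = 8000  # assumed sampling rate in Hz
    silence = [0.0] * 800
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / fs) for n in range(800)]
    signal = silence + tone + silence
    frames = frame_signal(pre_emphasis(signal), frame_len=200, hop=100)
    # Threshold values are hypothetical and tuned to this synthetic signal.
    return detect_endpoints(frames, energy_high=2e-3, energy_low=5e-4,
                            zcr_thresh=0.3)


if __name__ == "__main__":
    print(demo_endpoints())
```

On this synthetic signal the detected segment spans frames 7 through 15, which covers the tone between samples 800 and 1600. With real recordings the thresholds would normally be estimated from the background noise in the first few frames rather than fixed by hand.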
[Degree-granting institution]: Guangdong University of Technology
[Degree level]: Master's
[Year degree awarded]: 2014
[Classification number]: TP242;TN912.34

[Citing documents]