Activation-Function-Oriented Optimization of RNN Algorithms
Published: 2017-12-27 21:09
Keywords: Activation-Function-Oriented Optimization of RNN Algorithms  Source: Zhejiang University, 2017 master's thesis  Thesis type: degree thesis
Related topics: recurrent neural network; long short-term memory (LSTM) unit; activation function; non-saturated region; linear fitting
【Abstract】: Recurrent neural networks (RNNs) are an important branch of artificial neural networks. By introducing a feedback mechanism in the hidden layer, they can process sequential data effectively, and their ability to store and exploit contextual information has made them a research focus in speech recognition, natural language processing, computer vision, and related fields. On the one hand, RNNs commonly use sigmoid-type (S-shaped) activation functions, and the saturated regions of these functions limit the convergence speed of RNN training, so optimizing the activation function has become an active research topic. On the other hand, RNNs are mainly implemented in software, so hardware acceleration of the algorithm is of considerable significance. Against this background, and building on previous work, this thesis makes the following contributions. First, a theoretical survey of recurrent neural networks: the gate structure of the long short-term memory (LSTM) unit resolves the vanishing-gradient problem of traditional RNNs along the time dimension and has become a key component of RNN architectures; the training process of LSTM-based RNNs is analyzed, covering both the forward-propagation and back-propagation passes, and during back-propagation the activation function and its derivative directly affect the convergence speed of training. Second, optimization of the RNN algorithm from the perspective of the activation function itself: traditional sigmoid-type activation functions have saturated regions that slow convergence, while the rectified linear unit proposed in earlier work avoids vanishing gradients in the saturated region but introduces the exploding-gradient problem; since different coefficients of a sigmoid-type function yield non-saturated regions of different widths, the relationship between the coefficient and the training convergence speed is analyzed, and experiments confirm that widening the non-saturated region effectively accelerates training convergence. Third, optimization from the perspective of the hardware implementation of the activation function, which is difficult to realize in hardware and therefore of particular research interest: for error optimization, an error-correction term is added to the fitted line segments, halving the approximation error at unchanged hardware cost; for the segmentation scheme, the number of segments assigned to different sub-intervals is adjusted, further reducing the error at unchanged hardware cost; for extensibility, a parameterized sigmoid function and the tanh function are implemented on top of the sigmoid hardware implementation.
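As context for the LSTM training analysis summarized above, the following is a minimal sketch of one forward step of a standard LSTM cell with the gate structure the abstract refers to. The variable names, parameter layout, and shapes are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One forward step of a standard LSTM cell.

    W (4*hidden, input_dim), U (4*hidden, hidden) and b (4*hidden,)
    hold the stacked parameters of the input, forget, and output gates
    and the candidate cell update.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b              # pre-activations for all gates
    i = sigmoid(z[0 * hidden:1 * hidden])     # input gate
    f = sigmoid(z[1 * hidden:2 * hidden])     # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])     # output gate
    g = np.tanh(z[3 * hidden:4 * hidden])     # candidate cell state
    c_t = f * c_prev + i * g                  # gated cell-state update
    h_t = o * np.tanh(c_t)                    # hidden state output
    return h_t, c_t
```

During back-propagation through time, the derivatives of the sigmoid and tanh activations above enter every gate's gradient, which is why the abstract singles them out as the factor governing convergence speed.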
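The abstract's activation-function optimization varies the sigmoid coefficient so that different coefficients give non-saturated regions of different widths. The sketch below illustrates that idea with a parameterized sigmoid of the assumed form sigmoid(a*x); the thesis's exact parameterization is not reproduced here. A smaller coefficient stretches the curve and widens the interval on which the gradient remains non-negligible.

```python
import numpy as np

def scaled_sigmoid(x, a=1.0):
    """Sigmoid with coefficient a; smaller a -> wider non-saturated region."""
    return 1.0 / (1.0 + np.exp(-a * x))

def scaled_sigmoid_grad(x, a=1.0):
    """Derivative a * s * (1 - s); a controls how quickly the gradient saturates."""
    s = scaled_sigmoid(x, a)
    return a * s * (1.0 - s)

# Rough width of the non-saturated region: where the gradient stays above a threshold.
xs = np.linspace(-20, 20, 4001)
for a in (1.0, 0.5, 0.25):
    live = xs[scaled_sigmoid_grad(xs, a) > 0.01]
    print(f"a={a}: gradient > 0.01 roughly on [{live.min():.1f}, {live.max():.1f}]")
```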
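The hardware-oriented part of the abstract describes a piecewise-linear (lookup-table) approximation of the sigmoid with an added error-correction term, plus a tanh implementation that reuses the sigmoid hardware. The software sketch below illustrates both ideas; the segment boundaries, the midpoint-based correction, and the identity tanh(x) = 2*sigmoid(2x) - 1 are standard techniques used here as assumptions, and the thesis's actual segmentation and correction scheme may differ.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pwl_sigmoid(x, n_segments=8, x_max=8.0, correct=True):
    """Piecewise-linear sigmoid on [0, x_max], extended to x < 0 by symmetry.

    Each segment interpolates sigmoid linearly between its endpoints; the
    optional correction shifts each chord by half of its midpoint error,
    roughly halving the worst-case error at the same table size.
    """
    sign = np.sign(x)
    ax = np.clip(np.abs(x), 0.0, x_max)
    edges = np.linspace(0.0, x_max, n_segments + 1)
    idx = np.minimum(np.searchsorted(edges, ax, side="right") - 1, n_segments - 1)
    x0, x1 = edges[idx], edges[idx + 1]
    y0, y1 = sigmoid(x0), sigmoid(x1)
    y = y0 + (y1 - y0) * (ax - x0) / (x1 - x0)     # chord interpolation
    if correct:
        mid = 0.5 * (x0 + x1)
        err = 0.5 * (y0 + y1) - sigmoid(mid)       # approximate chord error, sampled at the midpoint
        y = y - 0.5 * err                          # split the error symmetrically
    return np.where(sign >= 0, y, 1.0 - y)         # sigmoid(-x) = 1 - sigmoid(x)

def pwl_tanh(x, **kw):
    """tanh reused from the sigmoid approximation: tanh(x) = 2*sigmoid(2x) - 1."""
    return 2.0 * pwl_sigmoid(2.0 * np.asarray(x, dtype=float), **kw) - 1.0

xs = np.linspace(-8, 8, 2001)
for correct in (False, True):
    err = np.max(np.abs(pwl_sigmoid(xs, correct=correct) - sigmoid(xs)))
    print(f"correction={correct}: max abs error = {err:.5f}")
```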
【Degree-granting institution】: Zhejiang University
【Degree level】: Master's
【Year degree conferred】: 2017
【CLC number】: TP183