Research on Parallel Implementation of Neural Networks Based on Cloud Computing and Their Learning Methods

Published: 2019-05-21 23:22

[Abstract]: With the rapid development of network, software, and cloud computing technologies, data volumes are growing massively, and we have entered the big data era. Real-world data, such as digital photographs, gene expression profiles, face data sets, and web page text, are typically high-dimensional and large in volume. Traditional artificial intelligence and pattern recognition techniques therefore face the challenge of processing data at this scale. For example, when classifying a large-scale face data set, a single computer or workstation lacks the speed and storage capacity to meet practical demands. It is therefore necessary to study how to implement artificial intelligence and pattern recognition techniques on multi-computer clusters in a big data environment. When an artificial intelligence method such as a neural network is used to process such data, the generalization ability and running time of a single neural network are satisfactory as long as the number of training samples is modest. However, as the number of classes to be recognized grows, the network structure becomes more complex, which leads to longer training times, slower convergence, a greater tendency to become trapped in local minima, and poorer generalization. To address these problems, this thesis studies and designs Hybrid Neural Networks (HNNs), which are composed of multiple neural networks, to replace a single complex neural network, and proposes a novel semi-supervised learning algorithm, Deep Belief Network Embedded with Softmax Regress (DBNESR), as a deep learning method for classification.
The main contributions of this thesis are as follows. (1) A parallel implementation method for multilayer neural networks, based on Map-Reduce on a cloud computing cluster, is proposed. That is, to meet the needs of big data processing, the thesis proposes an effective mechanism for mapping a fully connected multilayer neural network, trained with the error back-propagation (BP) algorithm, onto a cloud computing cluster using Map-Reduce. For a parallel BP algorithm on a cloud computing cluster and a serial BP algorithm on a single processor, the time required to execute each algorithm is derived theoretically, and the performance parameters of the parallel BP algorithm on the cluster (speedup, the optimal and minimum numbers of data nodes, and so on) are evaluated. Experimental results show that the proposed parallel BP algorithm achieves better speedup, faster convergence, and fewer iterations than existing algorithms. (2) A parallel implementation method for radial basis function (RBF) neural networks, based on Map-Reduce on a cloud computing cluster, is proposed, and applications such as affective computing are studied. That is, using the computing power that a cloud computing platform provides by pooling and combining resources over the network, parallel training and classification are implemented for the RBF neural network and its learning algorithm, so that the network can learn across platforms and process massive high-dimensional data in tasks such as face recognition, speech recognition, and affective computing. Experimental results show that the proposed algorithm learns faster, achieves a higher recognition rate, and handles larger data volumes than the traditional serial neural network training algorithm on a single computer.
(3) A semi-supervised learning algorithm, Deep Belief Network Embedded with Softmax Regress (DBNESR), is proposed, and several supervised classifiers are designed: BP, HBPNNs, RBF, HRBFNNs, SVM, and the Multiple Classification Decision Fusion Classifier (MCDFC), which integrates the HBPNNs, HRBFNNs, and SVM classifiers. Experimental results show that the semi-supervised deep learning algorithm DBNESR achieves a better, higher, and more stable recognition rate; that the semi-supervised learning algorithm outperforms all of the supervised learning algorithms; and that hybrid neural networks outperform single neural networks. The average recognition rates and variances are ordered as BP < HBPNNs ≈ RBF < HRBFNNs ≈ SVM < MCDFC < DBNESR and BP > RBF > HBPNNs > HRBFNNs > SVM > MCDFC > DBNESR, respectively, which reflects the ability of DBNESR to model complex artificial intelligence tasks.
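The Map-Reduce formulation of batch BP training described in contribution (1) can be pictured with a small in-process sketch. It is an illustration under stated assumptions, not the mapping mechanism of the thesis: the abstract does not give the network sizes, the loss, or how work is split across nodes, so the sketch assumes a one-hidden-layer sigmoid network, a squared-error loss, and a split by training samples. The names `map_gradients`, `reduce_gradients`, and `train` are hypothetical, and the "nodes" are simulated by ordinary function calls rather than a real cluster framework.

```python
import math
import random
from functools import reduce


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(n_in, n_hidden, seed=0):
    """Small random weights; the last entry of each row is the bias."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
          for _ in range(n_hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w1, w2


def map_gradients(shard, weights):
    """Map step (assumed): back-propagate over one data shard and
    return the gradients summed over that shard's samples."""
    w1, w2 = weights
    g1 = [[0.0] * len(row) for row in w1]
    g2 = [0.0] * len(w2)
    for x, target in shard:
        xb = list(x) + [1.0]                      # input plus bias term
        hidden = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w1]
        hb = hidden + [1.0]                       # hidden layer plus bias term
        out = sigmoid(sum(w * v for w, v in zip(w2, hb)))
        # Derivative of 0.5 * (out - target)^2 w.r.t. the output pre-activation.
        delta_out = (out - target) * out * (1.0 - out)
        for j, h in enumerate(hb):
            g2[j] += delta_out * h
        for i, h in enumerate(hidden):
            delta_h = delta_out * w2[i] * h * (1.0 - h)
            for k, v in enumerate(xb):
                g1[i][k] += delta_h * v
    return g1, g2


def reduce_gradients(a, b):
    """Reduce step (assumed): element-wise sum of two partial gradients."""
    g1 = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    g2 = [p + q for p, q in zip(a[1], b[1])]
    return g1, g2


def train(data, n_shards, epochs=20, lr=0.5, n_hidden=2):
    """Batch gradient descent where each epoch is one map/reduce round."""
    weights = init_weights(len(data[0][0]), n_hidden)
    shards = [data[i::n_shards] for i in range(n_shards)]
    for _ in range(epochs):
        # On a cluster, each call below would run on a separate data node.
        partials = [map_gradients(shard, weights) for shard in shards]
        g1, g2 = reduce(reduce_gradients, partials)
        w1, w2 = weights
        w1 = [[w - lr * g / len(data) for w, g in zip(rw, rg)]
              for rw, rg in zip(w1, g1)]
        w2 = [w - lr * g / len(data) for w, g in zip(w2, g2)]
        weights = (w1, w2)
    return weights
```

Because the batch gradient is a sum over samples, calling `train(data, n_shards=1)` and `train(data, n_shards=4)` on the same data should give the same weights up to floating-point rounding. In this sketch the partition only changes where the work is done, which is what makes speedup, defined as serial time divided by parallel time, a meaningful measure to evaluate.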
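[Degree-granting institution]: South China University of Technology
[Degree level]: Doctoral
[Year degree awarded]: 2015
[Classification number]: TP183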
