Regularization Methods for Optimizing the Structure of Feedforward Neural Networks

Published: 2021-10-25 14:35
In recent years, finding the most suitable feedforward neural network (FNN) architecture has attracted considerable attention. Several studies have proposed automatic methods for finding a small yet sufficient network structure that requires no additional retraining or correction. Regularization terms are often introduced into the learning process and have been shown to improve generalization performance and reduce network size effectively. In particular, Lp regularization is used during network training to penalize excessively large weight norms, and L1 and L1/2 regularization are the two most popular Lp methods. However, Lp regularization is typically used only to prune redundant individual weights; in other words, L1 and L1/2 regularization cannot improve sparsity at the unit (hidden-node) level. This thesis addresses these problems. First, we investigate a Group Lasso regularization method that acts directly on the norm of each hidden-layer neuron's outgoing weight vector. As a comparison, the ordinary Lasso regularization method is simply added to the network's standard error function and treats each weight individually. Numerical results show that, on every benchmark dataset, the proposed hidden-layer regularization method prunes more redundant hidden neurons than the Lasso regularization method does. However, while Group Lasso regularization can prune redundant hidden nodes, it cannot prune any redundant weights of the network's surviving hidden nodes. Next, we propose a Group L1/2 regularization method (denoted GL1/2), which treats the outgoing weight vector of each hidden node as a group in order to prune hidden nodes. Its advantage is that it can prune not only redundant hidden nodes but also...
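The abstract distinguishes penalties applied per weight (Lasso, L1/2) from penalties applied per hidden node. As a rough illustration of the group idea (not drawn verbatim from the thesis): if w_j denotes the outgoing weight vector of hidden node j, a Group Lasso term has the form lambda * sum_j ||w_j||_2, while a Group L1/2 term would instead penalize lambda * sum_j ||w_j||_2^(1/2). The following minimal Python sketch shows how a Group Lasso subgradient drives whole columns of the hidden-to-output weight matrix toward zero, which is what permits node-level pruning; the network sizes, learning rate, and the coefficient lambda_gl are illustrative assumptions, not the thesis's settings.

    import numpy as np

    rng = np.random.default_rng(0)

    n_in, n_hidden, n_out = 4, 10, 3              # illustrative sizes
    W1 = rng.normal(0.0, 0.5, (n_hidden, n_in))   # input -> hidden weights
    W2 = rng.normal(0.0, 0.5, (n_out, n_hidden))  # hidden -> output weights
    lambda_gl = 1e-3                              # assumed Group Lasso coefficient

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(X):
        H = sigmoid(X @ W1.T)   # hidden activations, shape (N, n_hidden)
        Y = sigmoid(H @ W2.T)   # network outputs, shape (N, n_out)
        return H, Y

    def train_step(X, T, lr=0.1, eps=1e-8):
        # One batch gradient step on squared error plus the group term
        # lambda_gl * sum_j ||W2[:, j]||_2 (one group per hidden node).
        global W1, W2
        N = X.shape[0]
        H, Y = forward(X)
        dY = (Y - T) * Y * (1.0 - Y) / N           # error signal at the outputs
        dW2 = dY.T @ H
        col_norms = np.linalg.norm(W2, axis=0, keepdims=True)
        dW2 += lambda_gl * W2 / (col_norms + eps)  # subgradient of the group term
        dH = (dY @ W2) * H * (1.0 - H)             # backpropagate to hidden layer
        dW1 = dH.T @ X
        W1 -= lr * dW1
        W2 -= lr * dW2

    # Toy run on random data; in a node-pruning scheme like the one the
    # abstract describes, hidden nodes whose outgoing norm falls below a
    # small threshold would be removed after training.
    X = rng.normal(size=(32, n_in))
    T = rng.uniform(size=(32, n_out))
    for _ in range(500):
        train_step(X, T)
    print(np.linalg.norm(W2, axis=0))  # per-node outgoing weight norms

Because the penalty shrinks each column of W2 as a unit rather than each entry independently, an unneeded hidden node's entire outgoing vector collapses together; this is the node-level sparsity that per-weight L1 or L1/2 penalties cannot provide.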

【Source】: Dalian University of Technology (Liaoning Province; a Project 211 and Project 985 institution directly under the Ministry of Education)

【Pages】: 105

【Degree】: Doctoral

【Table of Contents】:
ABSTRACT
摘要
1 Introduction
    1.1 Artificial neural networks
        1.1.1 Historical development of artificial neural networks
        1.1.2 Biological neurons
        1.1.3 Artificial neuron model and its basic elements
        1.1.4 Activation functions
    1.2 Learning mechanisms in artificial neural networks
        1.2.1 Supervised learning
        1.2.2 Unsupervised learning
        1.2.3 Batch and online gradient descent methods
    1.3 Artificial neural network architectures
        1.3.1 Feedforward neural network
        1.3.2 Recurrent neural network
    1.4 Problem statement
    1.5 Objectives of the thesis
    1.6 Outline of the thesis
2 Background: Methods for optimizing artificial neural network architecture
    2.1 Network growing method
    2.2 Network pruning method
        2.2.1 Sensitivity analysis methods
        2.2.2 Penalty (Regularization) methods
        2.2.3 Batch gradient method with L_(1/2) regularization term
3 Group lasso regularization method for pruning hidden layer nodes of feedforward neural networks
    3.1 Introduction
    3.2 Neural network structure and batch gradient method without any regularization term
    3.3 Batch gradient method with hidden layer regularization terms
        3.3.1 Batch gradient method with lasso regularization term
        3.3.2 Batch gradient method with Group Lasso regularization term
    3.4 Datasets
        3.4.1 K-fold cross-validation method
        3.4.2 Data normalization
    3.5 Hidden neuron selection criterion
    3.6 Results
        3.6.1 The iris results
        3.6.2 The zoo results
        3.6.3 The seeds results
        3.6.4 The ionosphere results
    3.7 Discussion
4 Group L_(1/2) regularization method for pruning hidden layer nodes of feedforward neural network
    4.1 Introduction
    4.2 Feedforward neural network and batch gradient algorithm
    4.3 GL_2, GL_(1/2) and SGL_(1/2) regularizations for hidden nodes
        4.3.1 Batch gradient method with GL_2 regularization
        4.3.2 Batch gradient method with Group L_(1/2) regularization
        4.3.3 Batch gradient method with smooth Group L_(1/2) regularization
    4.4 A convergence theorem
    4.5 Simulation results
        4.5.1 Iris datasets
        4.5.2 Balance scale datasets
        4.5.3 Ecoli datasets
        4.5.4 Lymphography datasets
        4.5.5 Why can GL_(1/2) prune the redundant weights of the surviving hidden nodes?
    4.6 Proofs
5 Conclusion and Future Work
    5.1 Conclusion
    5.2 Future work
    5.3 Abstract of innovation points
References
Published academic articles during Ph.D. period
Acknowledgements
Author Bio

