TY - CONF
AU - Yegenoglu, Alper
AU - Krajsek, Kai
AU - Diaz, Sandra
AU - Herty, Michael
TI - Ensemble Kalman Filter Optimizing Deep Neural Networks: An Alternative Approach to Non-performing Gradient Descent
ET - 5
VL - 12566
CY - Cham
PB - Springer
M1 - FZJ-2021-00117
T3 - Lecture Notes in Computer Science
SP - 78
EP - 92
PY - 2020
AB - The successful training of deep neural networks depends on the initialization scheme and the choice of activation functions. Poorly chosen parameter settings lead to the well-known problem of exploding or vanishing gradients, which arises when gradient descent and backpropagation are applied. In this setting, the Ensemble Kalman Filter (EnKF) can be used as an alternative optimizer for training neural networks. The EnKF does not require the explicit calculation of gradients or adjoints, and we show that this resolves the exploding and vanishing gradient problem. We analyze different parameter initializations, propose a dynamic change in ensembles, and compare the results to established methods.
T2 - The Sixth International Conference on Machine Learning, Optimization, and Data Science
Y2 - 19 Jul 2020 - 22 Jul 2020
M2 - Siena, Italy
LB - PUB:(DE-HGF)8 ; PUB:(DE-HGF)7
DO - 10.1007/978-3-030-64580-9_7
UR - https://juser.fz-juelich.de/record/889208
ER -