001 | 889208 | ||
005 | 20210127115307.0 | ||
024 | 7 | _ | |a 10.1007/978-3-030-64580-9_7 |2 doi |
024 | 7 | _ | |a 2128/26777 |2 Handle |
037 | _ | _ | |a FZJ-2021-00117 |
041 | _ | _ | |a English |
100 | 1 | _ | |a Yegenoglu, Alper |0 P:(DE-Juel1)161462 |b 0 |e Corresponding author |u fzj |
111 | 2 | _ | |a The Sixth International Conference on Machine Learning, Optimization, and Data Science |g LOD2020 |c Siena |d 2020-07-19 - 2020-07-22 |w Italy |
245 | _ | _ | |a Ensemble Kalman Filter Optimizing Deep Neural Networks: An Alternative Approach to Non-performing Gradient Descent |
250 | _ | _ | |a 5th ed. |
260 | _ | _ | |a Cham |c 2020 |b Springer |
295 | 1 | 0 | |a Machine Learning, Optimization, and Data Science |
300 | _ | _ | |a 78-92 |
336 | 7 | _ | |a CONFERENCE_PAPER |2 ORCID |
336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
336 | 7 | _ | |a conferenceObject |2 DRIVER |
336 | 7 | _ | |a Output Types/Conference Paper |2 DataCite |
336 | 7 | _ | |a Contribution to a conference proceedings |b contrib |m contrib |0 PUB:(DE-HGF)8 |s 1610561372_10824 |2 PUB:(DE-HGF) |
336 | 7 | _ | |a Contribution to a book |0 PUB:(DE-HGF)7 |2 PUB:(DE-HGF) |m contb |
490 | 0 | _ | |a Lecture Notes in Computer Science |v 12566 |
520 | _ | _ | |a The successful training of deep neural networks is dependent on initialization schemes and choice of activation functions. Non-optimally chosen parameter settings lead to the known problem of exploding or vanishing gradients. This issue occurs when gradient descent and backpropagation are applied. For this setting the Ensemble Kalman Filter (EnKF) can be used as an alternative optimizer when training neural networks. The EnKF does not require the explicit calculation of gradients or adjoints and we show this resolves the exploding and vanishing gradient problem. We analyze different parameter initializations, propose a dynamic change in ensembles and compare results to established methods. |
536 | _ | _ | |a 511 - Computational Science and Mathematical Methods (POF3-511) |0 G:(DE-HGF)POF3-511 |c POF3-511 |f POF III |x 0 |
536 | _ | _ | |a SMHB - Supercomputing and Modelling for the Human Brain (HGF-SMHB-2013-2017) |0 G:(DE-Juel1)HGF-SMHB-2013-2017 |c HGF-SMHB-2013-2017 |f SMHB |x 1 |
536 | _ | _ | |a CSD-SSD - Center for Simulation and Data Science (CSD) - School for Simulation and Data Science (SSD) (CSD-SSD-20190612) |0 G:(DE-Juel1)CSD-SSD-20190612 |c CSD-SSD-20190612 |x 2 |
536 | _ | _ | |a SLNS - SimLab Neuroscience (Helmholtz-SLNS) |0 G:(DE-Juel1)Helmholtz-SLNS |c Helmholtz-SLNS |x 3 |
536 | _ | _ | |a HDS LEE - Helmholtz School for Data Science in Life, Earth and Energy (HDS LEE) (HDS-LEE-20190612) |0 G:(DE-Juel1)HDS-LEE-20190612 |c HDS-LEE-20190612 |x 4 |
536 | _ | _ | |a PhD no Grant - Doktorand ohne besondere Förderung (PHD-NO-GRANT-20170405) |0 G:(DE-Juel1)PHD-NO-GRANT-20170405 |c PHD-NO-GRANT-20170405 |x 5 |
536 | _ | _ | |a HAF - Helmholtz Analytics Framework (ZT-I-0003) |0 G:(DE-HGF)ZT-I-0003 |c ZT-I-0003 |x 6 |
700 | 1 | _ | |a Krajsek, Kai |0 P:(DE-Juel1)129347 |b 1 |u fzj |
700 | 1 | _ | |a Diaz, Sandra |0 P:(DE-Juel1)165859 |b 2 |u fzj |
700 | 1 | _ | |a Herty, Michael |0 P:(DE-HGF)0 |b 3 |
773 | _ | _ | |a 10.1007/978-3-030-64580-9_7 |
856 | 4 | _ | |u https://juser.fz-juelich.de/record/889208/files/main.pdf |y OpenAccess |
909 | C | O | |o oai:juser.fz-juelich.de:889208 |p openaire |p open_access |p VDB |p driver |p dnbdelivery |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)161462 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)129347 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 2 |6 P:(DE-Juel1)165859 |
913 | 1 | _ | |a DE-HGF |b Key Technologies |l Supercomputing & Big Data |1 G:(DE-HGF)POF3-510 |0 G:(DE-HGF)POF3-511 |3 G:(DE-HGF)POF3 |2 G:(DE-HGF)POF3-500 |4 G:(DE-HGF)POF |v Computational Science and Mathematical Methods |x 0 |
914 | 1 | _ | |y 2020 |
915 | _ | _ | |a OpenAccess |0 StatID:(DE-HGF)0510 |2 StatID |
920 | _ | _ | |l yes |
920 | 1 | _ | |0 I:(DE-Juel1)JSC-20090406 |k JSC |l Jülich Supercomputing Center |x 0 |
980 | _ | _ | |a contrib |
980 | _ | _ | |a VDB |
980 | _ | _ | |a UNRESTRICTED |
980 | _ | _ | |a contb |
980 | _ | _ | |a I:(DE-Juel1)JSC-20090406 |
980 | 1 | _ | |a FullTexts |