001     889208
005     20210127115307.0
024 7 _ |a 10.1007/978-3-030-64580-9_7
|2 doi
024 7 _ |a 2128/26777
|2 Handle
037 _ _ |a FZJ-2021-00117
041 _ _ |a English
100 1 _ |a Yegenoglu, Alper
|0 P:(DE-Juel1)161462
|b 0
|e Corresponding author
|u fzj
111 2 _ |a The Sixth International Conference on Machine Learning, Optimization, and Data Science
|g LOD2020
|c Siena
|d 2020-07-19 - 2020-07-22
|w Italy
245 _ _ |a Ensemble Kalman Filter Optimizing Deep Neural Networks: An Alternative Approach to Non-performing Gradient Descent
250 _ _ |a 6th ed.
260 _ _ |a Cham
|c 2020
|b Springer
295 1 0 |a Machine Learning, Optimization, and Data Science
300 _ _ |a 78-92
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1610561372_10824
|2 PUB:(DE-HGF)
336 7 _ |a Contribution to a book
|0 PUB:(DE-HGF)7
|2 PUB:(DE-HGF)
|m contb
490 0 _ |a Lecture Notes in Computer Science
|v 12566
520 _ _ |a The successful training of deep neural networks depends on the initialization scheme and the choice of activation functions. Non-optimally chosen parameter settings lead to the well-known problem of exploding or vanishing gradients, which arises when gradient descent and backpropagation are applied. In this setting, the Ensemble Kalman Filter (EnKF) can be used as an alternative optimizer for training neural networks. The EnKF does not require the explicit calculation of gradients or adjoints, and we show that this resolves the exploding and vanishing gradient problem. We analyze different parameter initializations, propose a dynamic change in ensembles, and compare the results to established methods.
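Editor's note on the abstract: the following is a minimal, self-contained sketch of the gradient-free EnKF update the abstract describes, applied to a toy regression network. It is not the authors' implementation; the network architecture, the sine-curve data, the ensemble size, the iteration count, and the observation-noise scale gamma are all illustrative assumptions, and the deterministic EnKF variant (without per-member observation perturbation) is used for brevity.

import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (assumed): learn y = sin(x) on [-pi, pi].
x = np.linspace(-np.pi, np.pi, 50).reshape(1, -1)    # shape (1, n_samples)
y = np.sin(x).ravel()                                # shape (n_samples,)

# Tiny one-hidden-layer network; theta packs all weights and biases.
n_hidden = 10
sizes = [(n_hidden, 1), (n_hidden,), (1, n_hidden), (1,)]
n_params = int(sum(np.prod(s) for s in sizes))

def forward(theta):
    """Unpack theta and evaluate the network on all samples (no gradients)."""
    i, arrs = 0, []
    for s in sizes:
        n = int(np.prod(s))
        arrs.append(theta[i:i + n].reshape(s))
        i += n
    W1, b1, W2, b2 = arrs
    h = np.tanh(W1 @ x + b1[:, None])
    return (W2 @ h + b2[:, None]).ravel()            # shape (n_samples,)

# Ensemble of candidate parameter vectors (size is an assumption).
n_ens = 100
ensemble = rng.normal(0.0, 1.0, size=(n_ens, n_params))
gamma = 1e-2                                         # assumed noise scale

for _ in range(200):
    G = np.stack([forward(t) for t in ensemble])     # (n_ens, n_samples)
    d_theta = ensemble - ensemble.mean(axis=0)
    d_g = G - G.mean(axis=0)
    C_tg = d_theta.T @ d_g / n_ens                   # parameter-prediction cross-covariance
    C_gg = d_g.T @ d_g / n_ens                       # prediction covariance
    K = C_tg @ np.linalg.inv(C_gg + gamma * np.eye(y.size))
    ensemble = ensemble + (y - G) @ K.T              # Kalman-style update, gradient-free

print("final MSE:", float(np.mean((forward(ensemble.mean(axis=0)) - y) ** 2)))

Each iteration moves every ensemble member toward the data using only forward evaluations and ensemble covariances, which is why no backpropagated gradients (and hence no exploding or vanishing gradients) enter the update.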
536 _ _ |a 511 - Computational Science and Mathematical Methods (POF3-511)
|0 G:(DE-HGF)POF3-511
|c POF3-511
|f POF III
|x 0
536 _ _ |a SMHB - Supercomputing and Modelling for the Human Brain (HGF-SMHB-2013-2017)
|0 G:(DE-Juel1)HGF-SMHB-2013-2017
|c HGF-SMHB-2013-2017
|f SMHB
|x 1
536 _ _ |a CSD-SSD - Center for Simulation and Data Science (CSD) - School for Simulation and Data Science (SSD) (CSD-SSD-20190612)
|0 G:(DE-Juel1)CSD-SSD-20190612
|c CSD-SSD-20190612
|x 2
536 _ _ |a SLNS - SimLab Neuroscience (Helmholtz-SLNS)
|0 G:(DE-Juel1)Helmholtz-SLNS
|c Helmholtz-SLNS
|x 3
536 _ _ |a HDS LEE - Helmholtz School for Data Science in Life, Earth and Energy (HDS LEE) (HDS-LEE-20190612)
|0 G:(DE-Juel1)HDS-LEE-20190612
|c HDS-LEE-20190612
|x 4
536 _ _ |a PhD no Grant - Doktorand ohne besondere Förderung (PHD-NO-GRANT-20170405)
|0 G:(DE-Juel1)PHD-NO-GRANT-20170405
|c PHD-NO-GRANT-20170405
|x 5
536 _ _ |a HAF - Helmholtz Analytics Framework (ZT-I-0003)
|0 G:(DE-HGF)ZT-I-0003
|c ZT-I-0003
|x 6
700 1 _ |a Krajsek, Kai
|0 P:(DE-Juel1)129347
|b 1
|u fzj
700 1 _ |a Diaz, Sandra
|0 P:(DE-Juel1)165859
|b 2
|u fzj
700 1 _ |a Herty, Michael
|0 P:(DE-HGF)0
|b 3
773 _ _ |a 10.1007/978-3-030-64580-9_7
856 4 _ |u https://juser.fz-juelich.de/record/889208/files/main.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:889208
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)161462
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)129347
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)165859
913 1 _ |a DE-HGF
|b Key Technologies
|l Supercomputing & Big Data
|1 G:(DE-HGF)POF3-510
|0 G:(DE-HGF)POF3-511
|3 G:(DE-HGF)POF3
|2 G:(DE-HGF)POF3-500
|4 G:(DE-HGF)POF
|v Computational Science and Mathematical Methods
|x 0
914 1 _ |y 2020
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Centre
|x 0
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a contb
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

