001     1043684
005     20260122220257.0
020 _ _ |a 978-9935-9807-8-6
024 7 _ |2 datacite_doi
|a 10.34734/FZJ-2025-02982
037 _ _ |a FZJ-2025-02982
100 1 _ |0 P:(DE-Juel1)180916
|a Aach, Marcel
|b 0
|e Corresponding author
|u fzj
245 _ _ |a Parallel and Scalable Hyperparameter Optimization for Distributed Deep Learning Methods on High-Performance Computing Systems
|f - 2025-01-29
260 _ _ |c 2025
300 _ _ |a 172p
336 7 _ |2 DataCite
|a Output Types/Dissertation
336 7 _ |0 PUB:(DE-HGF)3
|2 PUB:(DE-HGF)
|a Book
|m book
336 7 _ |2 ORCID
|a DISSERTATION
336 7 _ |2 BibTeX
|a PHDTHESIS
336 7 _ |0 2
|2 EndNote
|a Thesis
336 7 _ |0 PUB:(DE-HGF)11
|2 PUB:(DE-HGF)
|a Dissertation / PhD Thesis
|b phd
|m phd
|s 1753357716_13675
336 7 _ |2 DRIVER
|a doctoralThesis
500 _ _ |a Additional Grant: Joint project: NXTAIM - NXT GEN (01.01.2024-31.12.2026)
502 _ _ |a Dissertation, University of Iceland, 2025
|b Dissertation
|c University of Iceland
|d 2025
|o 2025-01-29
520 _ _ |a The design of Deep Learning (DL) models is a complex task, involving decisions on the general architecture of the model (e.g., the number of layers of the Neural Network (NN)) and on the optimization algorithms (e.g., the learning rate). These so-called hyperparameters significantly influence the performance (e.g., accuracy or error rates) of the final DL model and are, therefore, of great importance. However, optimizing these hyperparameters is a computationally intensive process due to the necessity of evaluating many combinations to identify the best-performing ones. Often, the optimization is manually performed. This Ph.D. thesis leverages the power of High-Performance Computing (HPC) systems to perform automatic and efficient Hyperparameter Optimization (HPO) for DL models that are trained on large quantities of scientific data. On modern HPC systems, equipped with a high number of Graphics Processing Units (GPUs), it becomes possible to not only evaluate multiple models with different hyperparameter combinations in parallel but also to distribute the training of the models themselves to multiple GPUs. State-of-the-art HPO methods, based on the concepts of early stopping, have demonstrated significant reductions in the runtime of the HPO process. Their performance at scale, particularly in the context of HPC environments and when applied to large scientific datasets, has remained unexplored. This thesis thus researches parallel and scalable HPO methods that leverage new inherent capabilities of HPC systems and innovative workflows incorporating novel computing paradigms. The developed HPO methods are validated on different scientific datasets ranging from the Computational Fluid Dynamics (CFD) to Remote Sensing (RS) domain, spanning several hundred Gigabytes (GBs) to several Terabytes (TBs) in size.
536 _ _ |0 G:(DE-HGF)POF4-5111
|a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
|c POF4-511
|f POF IV
|x 0
536 _ _ |0 G:(EU-Grant)951733
|a RAISE - Research on AI- and Simulation-Based Engineering at Exascale (951733)
|c 951733
|f H2020-INFRAEDI-2019-1
|x 1
536 _ _ |0 G:(BMWK)19A23014l
|a nxtAIM - nxtAIM – NXT GEN AI Methods (19A23014l)
|c 19A23014l
|x 2
536 _ _ |0 G:(DE-Juel-1)SDLFSE
|a SDL Fluids & Solids Engineering
|c SDLFSE
|x 3
856 4 _ |u //juser.fz-juelich.de/record/1043684/files/PhD_Thesis_Marcel_Aach.pdf
856 4 _ |u https://juser.fz-juelich.de/record/1043684/files/PhD_Thesis_Marcel_Aach.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1043684
|p openaire
|p open_access
|p driver
|p VDB
|p ec_fundedresources
|p dnbdelivery
910 1 _ |0 I:(DE-588b)5008462-8
|6 P:(DE-Juel1)180916
|a Forschungszentrum Jülich
|b 0
|k FZJ
913 1 _ |0 G:(DE-HGF)POF4-511
|1 G:(DE-HGF)POF4-510
|2 G:(DE-HGF)POF4-500
|3 G:(DE-HGF)POF4
|4 G:(DE-HGF)POF
|9 G:(DE-HGF)POF4-5111
|a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|v Enabling Computational- & Data-Intensive Science and Engineering
|x 0
914 1 _ |y 2025
915 _ _ |0 StatID:(DE-HGF)0510
|2 StatID
|a OpenAccess
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a phd
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a book
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

