001043684 001__ 1043684
001043684 005__ 20250724210255.0
001043684 020__ $$a978-9935-9807-8-6
001043684 0247_ $$2datacite_doi$$a10.34734/FZJ-2025-02982
001043684 037__ $$aFZJ-2025-02982
001043684 1001_ $$0P:(DE-Juel1)180916$$aAach, Marcel$$b0$$eCorresponding author$$ufzj
001043684 245__ $$aParallel and Scalable Hyperparameter Optimization for Distributed Deep Learning Methods on High-Performance Computing Systems$$f - 2025-01-29
001043684 260__ $$c2025
001043684 300__ $$a172p
001043684 3367_ $$2DataCite$$aOutput Types/Dissertation
001043684 3367_ $$0PUB:(DE-HGF)3$$2PUB:(DE-HGF)$$aBook$$mbook
001043684 3367_ $$2ORCID$$aDISSERTATION
001043684 3367_ $$2BibTeX$$aPHDTHESIS
001043684 3367_ $$02$$2EndNote$$aThesis
001043684 3367_ $$0PUB:(DE-HGF)11$$2PUB:(DE-HGF)$$aDissertation / PhD Thesis$$bphd$$mphd$$s1753357716_13675
001043684 3367_ $$2DRIVER$$adoctoralThesis
001043684 500__ $$aAdditional Grant: Verbundprojekt: NXTAIM - NXT GEN (01.01.2024-31.12.2026)
001043684 502__ $$aDissertation, University of Iceland, 2025$$bDissertation$$cUniversity of Iceland$$d2025$$o2025-01-29
001043684 520__ $$aThe design of Deep Learning (DL) models is a complex task, involving decisions on the general architecture of the model (e.g., the number of layers of the Neural Network (NN)) and on the optimization algorithm (e.g., the learning rate). These so-called hyperparameters significantly influence the performance (e.g., accuracy or error rates) of the final DL model and are, therefore, of great importance. However, optimizing these hyperparameters is a computationally intensive process because many combinations must be evaluated to identify the best-performing ones. In practice, the optimization is often performed manually. This Ph.D. thesis leverages the power of High-Performance Computing (HPC) systems to perform automatic and efficient Hyperparameter Optimization (HPO) for DL models that are trained on large quantities of scientific data. On modern HPC systems, equipped with a high number of Graphics Processing Units (GPUs), it becomes possible not only to evaluate multiple models with different hyperparameter combinations in parallel but also to distribute the training of the models themselves across multiple GPUs. State-of-the-art HPO methods, based on the concept of early stopping, have demonstrated significant reductions in the runtime of the HPO process. However, their performance at scale, particularly in HPC environments and when applied to large scientific datasets, has remained unexplored. This thesis thus researches parallel and scalable HPO methods that leverage new inherent capabilities of HPC systems and innovative workflows incorporating novel computing paradigms. The developed HPO methods are validated on different scientific datasets ranging from the Computational Fluid Dynamics (CFD) domain to the Remote Sensing (RS) domain, spanning several hundred gigabytes (GB) to several terabytes (TB) in size.
001043684 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001043684 536__ $$0G:(EU-Grant)951733$$aRAISE - Research on AI- and Simulation-Based Engineering at Exascale (951733)$$c951733$$fH2020-INFRAEDI-2019-1$$x1
001043684 536__ $$0G:(BMWK)19A23014l$$anxtAIM - nxtAIM – NXT GEN AI Methods (19A23014l)$$c19A23014l$$x2
001043684 8564_ $$u//juser.fz-juelich.de/record/1043684/files/PhD_Thesis_Marcel_Aach.pdf
001043684 8564_ $$uhttps://juser.fz-juelich.de/record/1043684/files/PhD_Thesis_Marcel_Aach.pdf$$yOpenAccess
001043684 909CO $$ooai:juser.fz-juelich.de:1043684$$popenaire$$popen_access$$pdriver$$pVDB$$pec_fundedresources$$pdnbdelivery
001043684 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180916$$aForschungszentrum Jülich$$b0$$kFZJ
001043684 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001043684 9141_ $$y2025
001043684 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001043684 920__ $$lyes
001043684 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001043684 980__ $$aphd
001043684 980__ $$aVDB
001043684 980__ $$aUNRESTRICTED
001043684 980__ $$abook
001043684 980__ $$aI:(DE-Juel1)JSC-20090406
001043684 9801_ $$aFullTexts