001     1018062
005     20231121201850.0
024 7 _ |a 10.1145/3624062.3624259
|2 doi
024 7 _ |a 10.34734/FZJ-2023-04518
|2 datacite_doi
037 _ _ |a FZJ-2023-04518
100 1 _ |a Aach, Marcel
|0 P:(DE-Juel1)180916
|b 0
|e Corresponding author
|u fzj
111 2 _ |a SC-W 2023: Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis
|g SC 2023
|c Denver, CO
|d 2023-11-12 - 2023-11-17
|w USA
245 _ _ |a Short Paper: Accelerating Hyperparameter Optimization Algorithms with Mixed Precision
260 _ _ |c 2023
|b ACM New York, NY, USA
300 _ _ |a 1776–1779
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1700553840_7160
|2 PUB:(DE-HGF)
520 _ _ |a Hyperparameter Optimization (HPO) of Neural Networks (NNs) is a computationally expensive procedure. On accelerators, such as NVIDIA Graphics Processing Units (GPUs) equipped with Tensor Cores, it is possible to speed up NN training by reducing the precision of some of the NN parameters, also referred to as mixed precision training. This paper investigates the performance of three popular HPO algorithms in terms of the achieved speed-up and model accuracy, utilizing early stopping, Bayesian, and genetic optimization approaches, in combination with mixed precision functionalities. The benchmarks are performed on 64 GPUs in parallel on three datasets: two from the vision domain and one from the Computational Fluid Dynamics domain. The results show that larger speed-ups can be achieved for mixed-precision compared to full-precision HPO if the checkpoint frequency is kept low. In addition to the reduced runtime, small gains in generalization performance on the test set are observed.
536 _ _ |a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5111
|c POF4-511
|f POF IV
|x 0
536 _ _ |a RAISE - Research on AI- and Simulation-Based Engineering at Exascale (951733)
|0 G:(EU-Grant)951733
|c 951733
|f H2020-INFRAEDI-2019-1
|x 1
588 _ _ |a Dataset connected to CrossRef Conference
700 1 _ |a Sarma, Rakesh
|0 P:(DE-Juel1)188513
|b 1
700 1 _ |a Inanc, Eray
|0 P:(DE-Juel1)188268
|b 2
700 1 _ |a Riedel, Morris
|0 P:(DE-Juel1)132239
|b 3
|u fzj
700 1 _ |a Lintermann, Andreas
|0 P:(DE-Juel1)165948
|b 4
773 _ _ |a 10.1145/3624062.3624259
856 4 _ |u https://juser.fz-juelich.de/record/1018062/files/FZJ-2023-04518.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1018062
|p openaire
|p open_access
|p driver
|p VDB
|p ec_fundedresources
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)180916
910 1 _ |a University of Iceland
|0 I:(DE-HGF)0
|b 0
|6 P:(DE-Juel1)180916
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)188513
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)188268
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 3
|6 P:(DE-Juel1)132239
910 1 _ |a University of Iceland
|0 I:(DE-HGF)0
|b 3
|6 P:(DE-Juel1)132239
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 4
|6 P:(DE-Juel1)165948
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5111
|x 0
914 1 _ |y 2023
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts