001     1031786
005     20250317091735.0
024 7 _ |a 10.1109/SBAC-PAD63648.2024.00023
|2 doi
024 7 _ |a 10.34734/FZJ-2024-05813
|2 datacite_doi
037 _ _ |a FZJ-2024-05813
041 _ _ |a English
100 1 _ |a Maloney, Samuel
|0 P:(DE-Juel1)200390
|b 0
|e Corresponding author
|u fzj
111 2 _ |a 2024 IEEE 36th International Symposium on Computer Architecture and High Performance Computing
|g SBAC-PAD
|c Hilo, HI
|d 2024-11-13 - 2024-11-15
|w USA
245 _ _ |a Analyzing HPC Monitoring Data With a View Towards Efficient Resource Utilization
260 _ _ |c 2024
|b IEEE
295 1 0 |a 2024 IEEE 36th International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD)
300 _ _ |a 170-181
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a Journal Article
|0 PUB:(DE-HGF)16
|2 PUB:(DE-HGF)
|m journal
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1736144602_25368
|2 PUB:(DE-HGF)
336 7 _ |a Contribution to a book
|0 PUB:(DE-HGF)7
|2 PUB:(DE-HGF)
|m contb
500 _ _ |a The data used for this study are available at: https://doi.org/10.26165/JUELICH-DATA/BDFBPQ. 979-8-3503-5616-8/24/$31.00 © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
520 _ _ |a Compute nodes in modern HPC systems are growing in size and their hardware has become ever more diverse. Still, many HPC centers allocate the resources of full nodes exclusively to avoid contention, despite the associated risk of underutilization. This paper describes a thorough resource utilization study of CPU and GPU compute and memory capacity, and interconnect bandwidth on JUWELS, a mature leadership-class modular supercomputer, with the aim of identifying opportunities for improving utilization through advanced scheduling and node sharing. Separate analysis of CPU-only and GPU-accelerated nodes finds that CPU compute usage is already close to optimal for the CPU-only nodes, whereas there is plenty of scope for co-scheduling CPU-based jobs on GPU-accelerated nodes. Memory capacity and node-level interconnect bandwidth are sufficient to provision co-scheduled jobs. We analyze multiple one-month datasets to validate robustness of conclusions over time and compare with previous studies on other systems to establish generalizability of results.
536 _ _ |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5112
|c POF4-511
|f POF IV
|x 0
536 _ _ |a 5122 - Future Computing & Big Data Systems (POF4-512)
|0 G:(DE-HGF)POF4-5122
|c POF4-512
|f POF IV
|x 1
536 _ _ |a DEEP-SEA - DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES (955606)
|0 G:(EU-Grant)955606
|c 955606
|f H2020-JTI-EuroHPC-2019-1
|x 2
536 _ _ |0 G:(DE-Juel-1)ATMLAO
|a ATMLAO - ATML Application Optimization and User Service Tools (ATMLAO)
|c ATMLAO
|x 3
588 _ _ |a Dataset connected to CrossRef Conference
700 1 _ |a Suarez, Estela
|0 P:(DE-Juel1)142361
|b 1
|u fzj
700 1 _ |a Eicker, Norbert
|0 P:(DE-Juel1)132090
|b 2
|u fzj
700 1 _ |a Guimaraes, Filipe
|0 P:(DE-Juel1)162225
|b 3
|u fzj
700 1 _ |a Frings, Wolfgang
|0 P:(DE-Juel1)132108
|b 4
|u fzj
773 _ _ |a 10.1109/SBAC-PAD63648.2024.00023
|p 170-181
|t 2643-3001
|y 2024
856 4 _ |y OpenAccess
|u https://juser.fz-juelich.de/record/1031786/files/Maloney2024-postprint.pdf
856 4 _ |y Restricted
|u https://juser.fz-juelich.de/record/1031786/files/SBAC-PAD-24-presentation.pdf
909 C O |o oai:juser.fz-juelich.de:1031786
|p openaire
|p open_access
|p driver
|p VDB
|p ec_fundedresources
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)200390
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)142361
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)132090
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 3
|6 P:(DE-Juel1)162225
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 4
|6 P:(DE-Juel1)132108
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5112
|x 0
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-512
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Supercomputing & Big Data Infrastructures
|9 G:(DE-HGF)POF4-5122
|x 1
914 1 _ |y 2024
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a journal
980 _ _ |a contb
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts


Marc 21