001     1033553
005     20250204215538.0
024 7 _ |a 10.1109/CLUSTER59578.2024.00033
|2 doi
024 7 _ |a 10.34734/FZJ-2024-06434
|2 datacite_doi
037 _ _ |a FZJ-2024-06434
041 _ _ |a English
100 1 _ |a Li, Jie
|0 P:(DE-HGF)0
|b 0
|e Corresponding author
111 2 _ |a 2024 IEEE International Conference on Cluster Computing
|g CLUSTER
|c Kobe
|d 2024-09-24 - 2024-09-27
|w Japan
245 _ _ |a Job Scheduling in High Performance Computing Systems with Disaggregated Memory Resources
260 _ _ |c 2024
|b IEEE
300 _ _ |a 297-309
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1736147305_25368
|2 PUB:(DE-HGF)
520 _ _ |a Disaggregated memory promises to meet growing memory requirements of applications while improving system resource utilization in high-performance computing (HPC) systems. Compared to traditional systems, where expensive resources such as CPUs, GPUs, and memory are assigned to jobs in units of nodes, systems with disaggregated memory introduce memory pools that can be shared among jobs; this introduces new optimization metrics to the job scheduler. In this paper, we propose a data-driven approach to evaluate job scheduling and resource configuration in HPC systems with disaggregated memory. To incorporate the memory requirements of jobs for both local and disaggregated memory resources and improve system efficiency in open-science HPC systems, we introduce a novel job scheduling algorithm called FM (Fair Memory). Our simulation results show that FM outperforms commonly used job schedulers in terms of jobs’ bounded slowdown when the shared memory pool capacity is limited, and in terms of fairness under all conditions.
536 _ _ |a 5122 - Future Computing & Big Data Systems (POF4-512)
|0 G:(DE-HGF)POF4-5122
|c POF4-512
|f POF IV
|x 0
536 _ _ |a DEEP-SEA - DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES (955606)
|0 G:(EU-Grant)955606
|c 955606
|f H2020-JTI-EuroHPC-2019-1
|x 1
588 _ _ |a Dataset connected to CrossRef Conference
700 1 _ |a Michelogiannakis, George
|0 P:(DE-HGF)0
|b 1
|e Corresponding author
700 1 _ |a Maloney, Samuel
|0 P:(DE-Juel1)200390
|b 2
|u fzj
700 1 _ |a Cook, Brandon
|0 P:(DE-HGF)0
|b 3
700 1 _ |a Suarez, Estela
|0 P:(DE-Juel1)142361
|b 4
|u fzj
700 1 _ |a Shalf, John
|0 P:(DE-HGF)0
|b 5
700 1 _ |a Chen, Yong
|0 P:(DE-HGF)0
|b 6
773 _ _ |a 10.1109/CLUSTER59578.2024.00033
|p 297-309
|y 2024
856 4 _ |u https://juser.fz-juelich.de/record/1033553/files/li2024_accepted_article.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1033553
|p openaire
|p open_access
|p driver
|p VDB
|p ec_fundedresources
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)200390
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 4
|6 P:(DE-Juel1)142361
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-512
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Supercomputing & Big Data Infrastructures
|9 G:(DE-HGF)POF4-5122
|x 0
914 1 _ |y 2024
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

