Contribution to a conference proceedings FZJ-2024-06434

Job Scheduling in High Performance Computing Systems with Disaggregated Memory Resources


2024
IEEE

2024 IEEE International Conference on Cluster Computing (CLUSTER), Kobe, Japan, 24 Sep 2024 - 27 Sep 2024, IEEE, pp. 297-309 [10.1109/CLUSTER59578.2024.00033]


Please use a persistent id in citations: doi:10.1109/CLUSTER59578.2024.00033

Abstract: Disaggregated memory promises to meet the growing memory requirements of applications while improving system resource utilization in high-performance computing (HPC) systems. Compared to traditional systems, where expensive resources such as CPUs, GPUs, and memory are assigned to jobs in units of nodes, systems with disaggregated memory introduce memory pools that can be shared among jobs; this introduces new optimization metrics to the job scheduler. In this paper, we propose a data-driven approach to evaluate job scheduling and resource configuration in HPC systems with disaggregated memory. To incorporate the memory requirements of jobs for both local and disaggregated memory resources and improve system efficiency in open-science HPC systems, we introduce a novel job scheduling algorithm called FM (Fair Memory). Our simulation results show that FM outperforms commonly used job schedulers in terms of jobs’ bounded slowdown when the shared memory pool capacity is limited, and in terms of fairness under all conditions.
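
The abstract names bounded slowdown as an evaluation metric but does not define it. As a rough illustration only, the sketch below uses the definition commonly found in the job scheduling literature: response time (wait plus run time) divided by run time, with the run time clamped from below by a threshold so that very short jobs do not dominate, and the result clamped to at least 1. The `Job` fields, the function names, and the 10-second threshold `tau_s` are assumptions made for this example; they are not taken from the paper, which may use a different threshold or variant.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job record; field names are illustrative only."""
    wait_s: float  # time spent queued, in seconds
    run_s: float   # actual run time, in seconds


def bounded_slowdown(job: Job, tau_s: float = 10.0) -> float:
    """Bounded slowdown of one job.

    tau_s is an assumed lower bound on run time (10 s is a common
    choice in the literature, not a value confirmed by the paper).
    """
    response_s = job.wait_s + job.run_s
    return max(response_s / max(job.run_s, tau_s), 1.0)


def mean_bounded_slowdown(jobs: list[Job], tau_s: float = 10.0) -> float:
    """Average bounded slowdown over a non-empty list of jobs."""
    return sum(bounded_slowdown(j, tau_s) for j in jobs) / len(jobs)
```

For example, a job that waits 90 s and runs 10 s has a bounded slowdown of 10, while a 1-second job that starts immediately is clamped to 1 instead of scoring below it.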


Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5122 - Future Computing & Big Data Systems (POF4-512)
  2. DEEP-SEA - DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES (955606)

Appears in the scientific report 2024
Database coverage:
OpenAccess

The record appears in these collections:
Document types > Events > Contributions to a conference proceedings
Workflow collections > Public records
Institute collections > JSC
Publications database
Open Access

Record created 2024-11-26, last modified 2025-02-04

