Job Scheduling in High Performance Computing Systems with Disaggregated Memory Resources
Contribution to a conference proceedings | FZJ-2024-06434
2024
IEEE
Please use a persistent id in citations: doi:10.1109/CLUSTER59578.2024.00033 doi:10.34734/FZJ-2024-06434
Abstract: Disaggregated memory promises to meet the growing memory requirements of applications while improving system resource utilization in high-performance computing (HPC) systems. In traditional systems, expensive resources such as CPUs, GPUs, and memory are assigned to jobs in units of nodes. Systems with disaggregated memory instead introduce memory pools that can be shared among jobs, which gives the job scheduler new optimization metrics to consider. In this paper, we propose a data-driven approach to evaluate job scheduling and resource configuration in HPC systems with disaggregated memory. To incorporate the memory requirements of jobs for both local and disaggregated memory resources and improve system efficiency in open-science HPC systems, we introduce a novel job scheduling algorithm called FM (Fair Memory). Our simulation results show that FM outperforms commonly used job schedulers in terms of jobs’ bounded slowdown when the shared memory pool capacity is limited, and in terms of fairness under all conditions.
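The abstract reports results in terms of bounded slowdown without defining it. As a rough illustration, the sketch below uses the definition common in the job-scheduling literature: a job's response time (wait plus run time) divided by its run time, with the run time floored at a threshold so that very short jobs do not dominate the metric, and the result floored at 1. The function name `bounded_slowdown` and the 10-second default threshold are assumptions made for this example; the paper may use a different threshold or variant.

```python
def bounded_slowdown(wait_time: float, run_time: float, threshold: float = 10.0) -> float:
    """Bounded slowdown of a single job, all times in seconds.

    Assumed definition (common in the literature, not taken from the paper):
        max((wait + run) / max(run, threshold), 1)
    """
    if wait_time < 0 or run_time < 0 or threshold <= 0:
        raise ValueError("times must be non-negative and threshold positive")
    response_time = wait_time + run_time
    # Floor the denominator so very short jobs do not inflate the metric.
    return max(response_time / max(run_time, threshold), 1.0)


def mean_bounded_slowdown(jobs, threshold: float = 10.0) -> float:
    """Average bounded slowdown over an iterable of (wait_time, run_time) pairs."""
    values = [bounded_slowdown(w, r, threshold) for w, r in jobs]
    if not values:
        raise ValueError("at least one job is required")
    return sum(values) / len(values)
```

For example, a job that waits 90 s and runs 10 s has a bounded slowdown of 10, while a 1 s job that starts immediately is clamped to 1 rather than falling below it.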