Malleability in Modern HPC Systems: Current Experiences, Challenges, and Future Opportunities
Journal Article | FZJ-2025-00107
2024
IEEE
New York, NY
Please use a persistent id in citations: doi:10.1109/TPDS.2024.3406764 doi:10.34734/FZJ-2025-00107
Abstract: With the increase of complex scientific simulations driven by workflows and heterogeneous workload profiles, managing system resources effectively is essential for improving performance and system throughput, especially due to trends like heterogeneous HPC and deeply integrated systems with on-chip accelerators. For optimal resource utilization, dynamic resource allocation can improve productivity across all system and application levels, by adapting the applications’ configurations to the system's resources. In this context, malleable jobs, which can change resources at runtime, can increase the system throughput and resource utilization while bringing various advantages for HPC users (e.g., shorter waiting time). Malleability has received much attention recently, even though it has been an active research area for more than two decades. This article presents the state-of-the-art of malleable implementations in HPC systems, targeting mainly malleability in compute and I/O resources. Based on our experiences, we state our current concerns and list future opportunities for research.