TY  - CONF
AU  - Guimarães, Filipe Souza Mendes
AU  - Sankaran, Aravind
AU  - Frings, Wolfgang
TI  - Supporting HPC Users with LLview
VL  - 16091
SN  - 0302-9743
CY  - Heidelberg
PB  - Springer
M1  - FZJ-2025-05008
SN  - 978-3-032-07611-3 (print)
T2  - Lecture Notes in Computer Science
SP  - 40
EP  - 51
PY  - 2025
AB  - Diagnosing and reporting operational issues to optimise system usage and performance is challenging on large-scale HPC systems due to their sheer complexity. At the Jülich Supercomputing Centre (JSC), we address this challenge with LLview, an open-source system and job reporting framework. LLview provides near real-time metrics for analysis through a web portal with role-based access for users, administrators, and support staff. In this paper, we present a series of use cases demonstrating how LLview enables efficient diagnosis and resolution of system and application issues, enhancing both reactive and proactive support for HPC users.
T2  - 40th International Conference on High Performance Computing, ISC High Performance 2025.
CY  - 10 Jun 2025 - 13 Jun 2025, Hamburg (Germany)
Y2  - 10 Jun 2025 - 13 Jun 2025
M2  - Hamburg, Germany
LB  - PUB:(DE-HGF)8 ; PUB:(DE-HGF)7
DO  - 10.1007/978-3-032-07612-0
UR  - https://juser.fz-juelich.de/record/1048907
ER  -