Talk (non-conference) (Other) FZJ-2023-03546

HPC system and job monitoring with LLview


2022

RISC2 webinar series, Online, Germany, 7 Dec 2022 [10.34734/FZJ-2023-03546]


Please use a persistent id in citations: doi:10.34734/FZJ-2023-03546

Abstract: LLview is a monitoring infrastructure developed by the Jülich Supercomputing Centre with the objective of providing an easy-to-use and adaptable software suite for monitoring High Performance Computing systems. With the emergence of large heterogeneous machines in the Exascale range, the challenges of monitoring such huge systems increase significantly. To address them, LLview is under continuous development so that it works with a wide range of hardware systems and software interfaces with negligible overhead, while at the same time providing fast, reliable access to job reports, system-wide monitoring data, and real-time system information. That information is provided to system users, project advisors, support teams and system administrators. It helps them manage jobs and identify performance issues at many levels, and it helps system administrators find failures and system malfunctions. This webinar gives an overview of the different LLview components and their interaction with each other and with the system. Particular attention is given to the system monitoring views and the job reporting features, as they make it possible to trace the entire life cycle of a job and can help identify problems and bottlenecks at a very early stage.


Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
  2. RISC2 - A network for supporting the coordination of High-Performance Computing research between Europe and Latin America (101016478)

Appears in the scientific report 2023
Database coverage:
OpenAccess

The record appears in these collections:
Document types > Presentations > Talks (non-conference)
Workflow collections > Public records
Institute Collections > JSC
Publications database
Open Access

 Record created 2023-09-20, last modified 2024-02-26

