Journal Article FZJ-2024-05442

15+ years of joint parallel application performance analysis/tools training with Scalasca/Score-P and Paraver/Extrae toolsets


2025
Elsevier Science Amsterdam [u.a.]

Future generation computer systems 162, 107472 [10.1016/j.future.2024.07.050] special issue: "Highlights from the Joint Laboratory on Extreme Scale Computing"


Please use a persistent id in citations: doi:10.1016/j.future.2024.07.050

Abstract: The diverse landscape of distributed heterogeneous computer systems currently available and being created to address computational challenges with the highest performance requirements presents daunting complexity for application developers. They must effectively decompose and distribute their application functionality and data, efficiently orchestrating the associated communication and synchronisation, on multi/manycore CPU processors with multiple attached acceleration devices structured within compute nodes with interconnection networks of various topologies. Sophisticated compilers, runtime systems and libraries are (loosely) matched with debugging, performance measurement and analysis tools, with proprietary versions by integrators/vendors provided exclusively for their systems complemented by portable (primarily) open-source equivalents developed and supported by the international research community over many years. The Scalasca and Paraver toolsets are two widely employed examples of the latter, installed on personal notebook computers through to the largest leadership HPC systems. Over more than fifteen years their developers have worked closely together in numerous collaborative projects culminating in the creation of a universal parallel performance assessment and optimisation methodology focused on application execution efficiency and scalability, and the associated training and coaching of application developers (often in teams) in its productive use, reviewed in this article with lessons learnt therefrom.


Note: Keywords: Hybrid parallel programming; MPI message-passing; OpenMP multithreading; OpenACC device offload acceleration; HPC application execution performance measurement & analysis; Performance assessment & optimisation methodology & tools; Hands-on training & coaching

Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
  2. JLESC - Joint Laboratory for Extreme Scale Computing (JLESC-20150708)
  3. POP - Performance Optimisation and Productivity (676553)
  4. POP2 - Performance Optimisation and Productivity 2 (824080)
  5. POP3 - Performance Optimisation and Productivity 3 (101143931)
  6. ATMLPP - ATML Parallel Performance (ATMLPP)

Database coverage:
Medline ; Creative Commons Attribution-NonCommercial CC BY-NC 4.0 ; OpenAccess ; Clarivate Analytics Master Journal List ; Current Contents - Engineering, Computing and Technology ; Essential Science Indicators ; IF >= 5 ; JCR ; SCOPUS ; Science Citation Index Expanded ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal articles
Workflow collections > Public records
Workflow collections > Publication charges
Institute collections > JSC
Publications database
Open Access

 Record created 2024-09-09, last modified 2025-03-14

