001     1031485
005     20250314084122.0
024 7 _ |a 10.34734/FZJ-2024-05698
|2 datacite_doi
037 _ _ |a FZJ-2024-05698
041 _ _ |a English
100 1 _ |a Reuter, Jan Andre
|0 P:(DE-Juel1)167509
|b 0
|e Corresponding author
|u fzj
111 2 _ |a 15th International Parallel Tools Workshop 2024
|c Dresden
|d 2024-09-19 - 2024-09-20
|w Germany
245 _ _ |a Score-P and OMPT: Smoothing the bumpy road to OpenMP performance measurement
260 _ _ |c 2024
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a Other
|2 DataCite
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a LECTURE_SPEECH
|2 ORCID
336 7 _ |a Conference Presentation
|b conf
|m conf
|0 PUB:(DE-HGF)6
|s 1730714036_29937
|2 PUB:(DE-HGF)
|x After Call
520 _ _ |a The OpenMP API is a widely used interface for high-level parallel programming in C, C++ and Fortran. Initially introduced in 1997, it now targets three basic processor building blocks: CPUs, SIMD vector units, and accelerators. With large adoption in the HPC community and wide support from compiler vendors, OpenMP grew into a key component in leveraging node-level parallelism in applications and frameworks. As a result, a need for OpenMP-aware performance measurement and analysis tools arose. In version 5.0 of the OpenMP specification, the OpenMP Tools Interface (OMPT) was introduced, providing means to collect precise information about the application's use of OpenMP directives and lock routines. Although OMPT comes with a detailed specification, understanding and correctly handling the CPU execution model event sequence dispatched from various vendors' runtimes requires detailed analysis of events, their parameters and executing threads. To facilitate this analysis, we developed a freely available OMPT tool that allows for dumping execution model events and corresponding metadata for post-mortem inspection. Analyzing the output of this tool applied to the official OpenMP examples and handwritten smoke tests enabled us to implement an OMPT tool for the performance measurement infrastructure Score-P, replacing the long-established, but feature-incomplete source-to-source OpenMP instrumenter OPARI2. Both OMPT tools are regularly tested against the aforementioned OpenMP examples and smoke tests. As vendors take the freedom to interpret the OMPT specification, various checks were developed to detect deviations. In Score-P, deviations are classified as fatal, disengageable, and remediable. Based on feedback given to the vendors, several of the deviations are no longer a concern. Throughout the development of OMPT itself, the overhead it introduces in the OpenMP runtimes has always been a concern.
To assess this overhead in various contemporary runtimes, we used the EPCC and SPEC OpenMP benchmark suites, with OMPT disabled (if possible), with a dummy tool, and with the Score-P OMPT tool attached.
536 _ _ |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5112
|c POF4-511
|f POF IV
|x 0
536 _ _ |a BMBF 16ME0630 - ENSIMA - Energieoptimiertes High-Performance Computing für Finite-Elemente-Simulationen in der Produktentwicklung (16ME0630)
|0 G:(BMBF)16ME0630
|c 16ME0630
|x 1
536 _ _ |0 G:(DE-Juel-1)ATMLPP
|a ATMLPP - ATML Parallel Performance (ATMLPP)
|c ATMLPP
|x 2
700 1 _ |a Feld, Christian
|0 P:(DE-Juel1)132244
|b 1
|u fzj
700 1 _ |a Mohr, Bernd
|0 P:(DE-Juel1)132199
|b 2
|u fzj
856 4 _ |u https://tu-dresden.de/zih/die-einrichtung/ressourcen/dateien/toolsworkshop/2024_09_19-JanReuter.pdf
856 4 _ |u https://juser.fz-juelich.de/record/1031485/files/2024_09_19-JanReuter.pdf
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1031485/files/2024_09_19-JanReuter.gif?subformat=icon
|x icon
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1031485/files/2024_09_19-JanReuter.jpg?subformat=icon-1440
|x icon-1440
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1031485/files/2024_09_19-JanReuter.jpg?subformat=icon-180
|x icon-180
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1031485/files/2024_09_19-JanReuter.jpg?subformat=icon-640
|x icon-640
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1031485
|p openaire
|p open_access
|p VDB
|p driver
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)167509
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)132244
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)132199
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5112
|x 0
914 1 _ |y 2024
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a conf
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

