Trace-Based Detection of Lock Contention in MPI One-Sided Communication

Hermanns, Marc-André; Wolf, Felix; Mohr, Bernd; Geimer, Markus

doi:10.1007/978-3-319-56702-0_6

Contribution to a conference proceedings/Contribution to a book

FZJ-2017-03736

Hermanns, M.-A. (Corresponding author, FZJ) ; Geimer, M. (FZJ) ; Mohr, B. (FZJ) ; Wolf, F.

2017
Cham : Springer International Publishing

Tools for High Performance Computing 2016 / Niethammer, Christoph (Editor) ; Cham : Springer International Publishing, 2017, Chapter 6 ; ISBN: 978-3-319-56701-3
10th International Workshop on Parallel Tools for High Performance Computing, IPTW'16, Stuttgart, Germany, 4 Oct 2016 - 5 Oct 2016 ; Cham : Springer International Publishing, pp. 97-114 (2017) [doi:10.1007/978-3-319-56702-0_6]

Please use a persistent id in citations: doi:10.1007/978-3-319-56702-0_6

Abstract: Performance analysis is an essential part of the development process of HPC applications. Developers therefore need adequate tools to evaluate design and implementation decisions in order to develop efficient parallel applications. It is thus crucial that tools support the available language and library features as completely as possible, so that design decisions are not negatively influenced by the level of available tool support. The Message Passing Interface (MPI) supports three basic communication paradigms: point-to-point, collective, and one-sided. Each of these targets and excels at a specific application scenario. While current performance tools support the first two quite well, one-sided communication is often neglected. In our earlier work, we narrowed this gap by showing how wait states in MPI one-sided communication using active-target synchronization can be detected at large scale with our trace-based message replay technique. Further extending our work on the detection of progress-related wait states in ARMCI, this paper presents an improved infrastructure that can detect not only progress-related wait states but also wait states due to lock contention in MPI passive-target synchronization. We present an event-based definition of lock contention and the trace-based algorithm to detect it, as well as initial results with a micro-benchmark and an application kernel scaling up to 65,536 processes.

Appears in the scientific report 2017

The record appears in these collections:
Document types > Events > Contributions to a conference proceedings
Document types > Books > Contribution to a book
JARA > JARA > JARA-JARA-HPC
Workflow collections > Public records
Institute Collections > JSC
Publications database

Record created 2017-05-22, last modified 2025-03-14
