001     830159
005     20250314084116.0
024 7 _ |a 10.1007/978-3-319-56702-0_6
|2 doi
037 _ _ |a FZJ-2017-03736
041 _ _ |a English
100 1 _ |a Hermanns, Marc-André
|0 P:(DE-Juel1)168253
|b 0
|e Corresponding author
|u fzj
111 2 _ |a 10th International Workshop on Parallel Tools for High Performance Computing
|g IPTW'16
|c Stuttgart
|d 2016-10-04 - 2016-10-05
|w Germany
245 _ _ |a Trace-Based Detection of Lock Contention in MPI One-Sided Communication
260 _ _ |a Cham
|c 2017
|b Springer International Publishing
295 1 0 |a Tools for High Performance Computing 2016 / Niethammer, Christoph (Editor) ; Cham : Springer International Publishing, 2017, Chapter 6 ; ISBN: 978-3-319-56701-3
300 _ _ |a 97-114
336 7 _ |a Contribution to a conference proceedings
|0 PUB:(DE-HGF)8
|2 PUB:(DE-HGF)
|m contrib
336 7 _ |a BOOK_CHAPTER
|2 ORCID
336 7 _ |a Book Section
|0 7
|2 EndNote
336 7 _ |a bookPart
|2 DRIVER
336 7 _ |a INBOOK
|2 BibTeX
336 7 _ |a Output Types/Book chapter
|2 DataCite
336 7 _ |a Contribution to a book
|b contb
|m contb
|0 PUB:(DE-HGF)7
|s 1495455418_25481
|2 PUB:(DE-HGF)
520 _ _ |a Performance analysis is an essential part of the development process of HPC applications. Thus, developers need adequate tools to evaluate design and implementation decisions to effectively develop efficient parallel applications. Therefore, it is crucial that tools provide support that is as complete as possible for the available language and library features to ensure that design decisions are not negatively influenced by the level of available tool support. The Message Passing Interface (MPI) supports three basic communication paradigms: point-to-point, collective, and one-sided. Each of these targets and excels at a specific application scenario. While current performance tools support the first two quite well, one-sided communication is often neglected. In our earlier work, we were able to reduce this gap by showing how wait states in MPI one-sided communication using active-target synchronization can be detected at large scale using our trace-based message replay technique. Further extending our work on the detection of progress-related wait states in ARMCI, this paper presents an improved infrastructure that is capable of not only detecting progress-related wait states, but also wait states due to lock contention in MPI passive-target synchronization. We present an event-based definition of lock contention, the trace-based algorithm to detect it, as well as initial results with a micro-benchmark and an application kernel scaling up to 65,536 processes.
536 _ _ |a 511 - Computational Science and Mathematical Methods (POF3-511)
|0 G:(DE-HGF)POF3-511
|c POF3-511
|f POF III
|x 0
536 _ _ |0 G:(DE-Juel-1)ATMLPP
|a ATMLPP - ATML Parallel Performance (ATMLPP)
|c ATMLPP
|x 1
588 _ _ |a Dataset connected to CrossRef Book
700 1 _ |a Geimer, Markus
|0 P:(DE-Juel1)132112
|b 1
|u fzj
700 1 _ |a Mohr, Bernd
|0 P:(DE-Juel1)132199
|b 2
|u fzj
700 1 _ |a Wolf, Felix
|0 P:(DE-HGF)0
|b 3
773 _ _ |a 10.1007/978-3-319-56702-0_6
909 C O |o oai:juser.fz-juelich.de:830159
|p VDB
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)168253
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)132112
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)132199
910 1 _ |a TU Darmstadt
|0 I:(DE-HGF)0
|b 3
|6 P:(DE-Juel1)132299
913 1 _ |a DE-HGF
|b Key Technologies
|1 G:(DE-HGF)POF3-510
|0 G:(DE-HGF)POF3-511
|2 G:(DE-HGF)POF3-500
|v Computational Science and Mathematical Methods
|x 0
|4 G:(DE-HGF)POF
|3 G:(DE-HGF)POF3
|l Supercomputing & Big Data
914 1 _ |y 2017
920 _ _ |l yes
920 1 _ |0 I:(DE-82)080012_20140620
|k JARA-HPC
|l JARA - HPC
|x 0
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 1
980 _ _ |a contb
980 _ _ |a VDB
980 _ _ |a contrib
980 _ _ |a I:(DE-82)080012_20140620
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a UNRESTRICTED

