001     811713
005     20250314084114.0
024 7 _ |a 10.1145/2934661
|2 doi
037 _ _ |a FZJ-2016-04097
041 _ _ |a English
100 1 _ |a Böhme, David
|0 P:(DE-HGF)0
|b 0
|e Corresponding author
245 _ _ |a Identifying the Root Causes of Wait States in Large-Scale Parallel Applications
260 _ _ |a New York, NY
|c 2016
|b ACM Association for Computing Machinery
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1470211850_15478
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a Driven by growing application requirements and accelerated by current trends in microprocessor design, the number of processor cores on modern supercomputers is increasing from generation to generation. However, load or communication imbalance prevents many codes from taking advantage of the available parallelism, as delays of single processes may spread wait states across the entire machine. Moreover, when employing complex point-to-point communication patterns, wait states may propagate along far-reaching cause-effect chains that are hard to track manually and that complicate an assessment of the actual costs of an imbalance. Building on earlier work by Meira Jr. et al., we present a scalable approach that identifies program wait states and attributes their costs in terms of resource waste to their original cause. By replaying event traces in parallel both forward and backward, we can identify the processes and call paths responsible for the most severe imbalances even for runs with hundreds of thousands of processes.
536 _ _ |a 511 - Computational Science and Mathematical Methods (POF3-511)
|0 G:(DE-HGF)POF3-511
|c POF3-511
|f POF III
|x 0
536 _ _ |0 G:(DE-Juel-1)ATMLPP
|a ATMLPP - ATML Parallel Performance (ATMLPP)
|c ATMLPP
|x 1
588 _ _ |a Dataset connected to CrossRef
700 1 _ |a Geimer, Markus
|0 P:(DE-Juel1)132112
|b 1
|u fzj
700 1 _ |a Arnold, Lukas
|0 P:(DE-Juel1)132044
|b 2
|u fzj
700 1 _ |a Voigtlaender, Felix
|0 P:(DE-HGF)0
|b 3
700 1 _ |a Wolf, Felix
|0 P:(DE-HGF)0
|b 4
773 _ _ |a 10.1145/2934661
|g Vol. 3, no. 2, p. 1 - 24
|0 PERI:(DE-600)2845845-X
|n 2
|p 11
|t ACM Transactions on Parallel Computing
|v 3
|y 2016
|x 2374-0353
856 4 _ |u https://juser.fz-juelich.de/record/811713/files/TOPC-201607-03-02-11.pdf
|y Restricted
856 4 _ |u https://juser.fz-juelich.de/record/811713/files/TOPC-201607-03-02-11.gif?subformat=icon
|x icon
|y Restricted
856 4 _ |u https://juser.fz-juelich.de/record/811713/files/TOPC-201607-03-02-11.jpg?subformat=icon-1440
|x icon-1440
|y Restricted
856 4 _ |u https://juser.fz-juelich.de/record/811713/files/TOPC-201607-03-02-11.jpg?subformat=icon-180
|x icon-180
|y Restricted
856 4 _ |u https://juser.fz-juelich.de/record/811713/files/TOPC-201607-03-02-11.jpg?subformat=icon-640
|x icon-640
|y Restricted
909 C O |o oai:juser.fz-juelich.de:811713
|p VDB
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)132112
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)132044
913 1 _ |a DE-HGF
|b Key Technologies
|1 G:(DE-HGF)POF3-510
|0 G:(DE-HGF)POF3-511
|2 G:(DE-HGF)POF3-500
|v Computational Science and Mathematical Methods
|x 0
|4 G:(DE-HGF)POF
|3 G:(DE-HGF)POF3
|l Supercomputing & Big Data
914 1 _ |y 2016
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a UNRESTRICTED

