Profiling Hybrid HMPP Applications with Score-P on Heterogeneous Hardware

Schlütter, Marc; Mohr, Bernd; Morin, Laurent; Philippen, Peter; Geimer, Markus

doi:10.3233/978-1-61499-381-0-773

Marc 21

001			152041
005			20250314084110.0
020	_	_	\|a 978-1-61499-380-3
024	7	_	\|a 10.3233/978-1-61499-381-0-773 \|2 doi
024	7	_	\|a WOS:000452120400078 \|2 WOS
037	_	_	\|a FZJ-2014-01861
100	1	_	\|a Schlütter, Marc \|0 P:(DE-Juel1)142180 \|b 0 \|u fzj
111	2	_	\|a International Conference on Parallel Computing \|g ParCo 2013 \|c Munich \|d 2013-09-10 - 2013-09-13 \|w Germany
245	_	_	\|a Profiling Hybrid HMPP Applications with Score-P on Heterogeneous Hardware
260	_	_	\|c 2014 \|b IOS Press
295	1	0	\|a Parallel Computing: Accelerating Computational Science and Engineering (CSE)
300	_	_	\|a 773 - 782
336	7	_	\|a Contribution to a conference proceedings \|b contrib \|m contrib \|0 PUB:(DE-HGF)8 \|s 1402401709_15788 \|2 PUB:(DE-HGF)
336	7	_	\|a Contribution to a book \|0 PUB:(DE-HGF)7 \|2 PUB:(DE-HGF) \|m contb
336	7	_	\|a Conference Paper \|0 33 \|2 EndNote
336	7	_	\|a CONFERENCE_PAPER \|2 ORCID
336	7	_	\|a Output Types/Conference Paper \|2 DataCite
336	7	_	\|a conferenceObject \|2 DRIVER
336	7	_	\|a INPROCEEDINGS \|2 BibTeX
490	0	_	\|a Advances in Parallel Computing \|v 25
520	_	_	\|a In heterogeneous environments with multi-core systems and accelerators, programming and optimizing large parallel applications turns into a time-intensive and hardware-dependent challenge. To assist application developers in this process, a number of tools and high-level compilers have been developed. Directive-based programming models such as HMPP and OpenACC provide abstractions over low-level GPU programming models, such as CUDA or OpenCL. The compilers developed by CAPS automatically transform the pragma-annotated application code into low-level code, thereby allowing the parallelization and optimization for a given accelerator hardware. To analyze the performance of parallel applications, multiple partners in Germany and the US jointly develop the community measurement infrastructure Score-P. Score-P gathers performance execution profiles, which can be presented and analyzed within the CUBE result browser, and collects detailed event traces to be processed by post-mortem analysis tools such as Scalasca and Vampir. In this paper, we present the integration and combined use of Score-P and the CAPS compilers as one approach to efficiently parallelize and optimize codes. Specifically, we describe the PHMPP profiling interface, its implementation in Score-P, and the presentation of preliminary results in CUBE.
536	_	_	\|a 411 - Computational Science and Mathematical Methods (POF2-411) \|0 G:(DE-HGF)POF2-411 \|c POF2-411 \|f POF II \|x 0
536	_	_	\|0 G:(DE-Juel-1)ATMLPP \|a ATMLPP - ATML Parallel Performance (ATMLPP) \|c ATMLPP \|x 1
700	1	_	\|a Philippen, Peter \|0 P:(DE-Juel1)143710 \|b 1 \|u fzj
700	1	_	\|a Morin, Laurent \|0 P:(DE-HGF)0 \|b 2
700	1	_	\|a Geimer, Markus \|0 P:(DE-Juel1)132112 \|b 3 \|u fzj
700	1	_	\|a Mohr, Bernd \|0 P:(DE-Juel1)132199 \|b 4 \|u fzj
773	_	_	\|a 10.3233/978-1-61499-381-0-773
856	4	_	\|u https://juser.fz-juelich.de/record/152041/files/FZJ-2014-01861.pdf \|y Restricted
909	C	O	\|o oai:juser.fz-juelich.de:152041 \|p VDB
910	1	_	\|a Forschungszentrum Jülich GmbH \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 0 \|6 P:(DE-Juel1)142180
910	1	_	\|a Forschungszentrum Jülich GmbH \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 1 \|6 P:(DE-Juel1)143710
910	1	_	\|a Forschungszentrum Jülich GmbH \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 3 \|6 P:(DE-Juel1)132112
910	1	_	\|a Forschungszentrum Jülich GmbH \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 4 \|6 P:(DE-Juel1)132199
913	2	_	\|a DE-HGF \|b Key Technologies \|l Supercomputing & Big Data \|1 G:(DE-HGF)POF3-510 \|0 G:(DE-HGF)POF3-511 \|2 G:(DE-HGF)POF3-500 \|v Computational Science and Mathematical Methods \|x 0
913	1	_	\|a DE-HGF \|b Schlüsseltechnologien \|l Supercomputing \|1 G:(DE-HGF)POF2-410 \|0 G:(DE-HGF)POF2-411 \|2 G:(DE-HGF)POF2-400 \|v Computational Science and Mathematical Methods \|x 0 \|4 G:(DE-HGF)POF \|3 G:(DE-HGF)POF2
914	1	_	\|y 2014
920	1	_	\|0 I:(DE-Juel1)JSC-20090406 \|k JSC \|l Jülich Supercomputing Center \|x 0
980	_	_	\|a contrib
980	_	_	\|a VDB
980	_	_	\|a contb
980	_	_	\|a I:(DE-Juel1)JSC-20090406
980	_	_	\|a UNRESTRICTED
