000152041 001__ 152041
000152041 005__ 20250314084110.0
000152041 020__ $$a978-1-61499-380-3
000152041 0247_ $$2doi$$a10.3233/978-1-61499-381-0-773
000152041 0247_ $$2WOS$$aWOS:000452120400078
000152041 037__ $$aFZJ-2014-01861
000152041 1001_ $$0P:(DE-Juel1)142180$$aSchlütter, Marc$$b0$$ufzj
000152041 1112_ $$aInternational Conference on Parallel Computing$$cMunich$$d2013-09-10 - 2013-09-13$$gParCo 2013$$wGermany
000152041 245__ $$aProfiling Hybrid HMPP Applications with Score-P on Heterogeneous Hardware
000152041 260__ $$bIOS Press$$c2014
000152041 29510 $$aParallel Computing: Accelerating Computational Science and Engineering (CSE)
000152041 300__ $$a773 - 782
000152041 3367_ $$0PUB:(DE-HGF)8$$2PUB:(DE-HGF)$$aContribution to a conference proceedings$$bcontrib$$mcontrib$$s1402401709_15788
000152041 3367_ $$0PUB:(DE-HGF)7$$2PUB:(DE-HGF)$$aContribution to a book$$mcontb
000152041 3367_ $$033$$2EndNote$$aConference Paper
000152041 3367_ $$2ORCID$$aCONFERENCE_PAPER
000152041 3367_ $$2DataCite$$aOutput Types/Conference Paper
000152041 3367_ $$2DRIVER$$aconferenceObject
000152041 3367_ $$2BibTeX$$aINPROCEEDINGS
000152041 4900_ $$aAdvances in Parallel Computing$$v25
000152041 520__ $$aIn heterogeneous environments with multi-core systems and accelerators, programming and optimizing large parallel applications turns into a time-intensive and hardware-dependent challenge. To assist application developers in this process, a number of tools and high-level compilers have been developed. Directive-based programming models such as HMPP and OpenACC provide abstractions over low-level GPU programming models, such as CUDA or OpenCL. The compilers developed by CAPS automatically transform the pragma-annotated application code into low-level code, thereby allowing parallelization and optimization for given accelerator hardware. To analyze the performance of parallel applications, multiple partners in Germany and the US jointly develop the community measurement infrastructure Score-P. Score-P gathers performance execution profiles, which can be presented and analyzed within the CUBE result browser, and collects detailed event traces to be processed by post-mortem analysis tools such as Scalasca and Vampir. In this paper we present the integration and combined use of Score-P and the CAPS compilers as one approach to efficiently parallelize and optimize codes. Specifically, we describe the PHMPP profiling interface, its implementation in Score-P, and the presentation of preliminary results in CUBE.
000152041 536__ $$0G:(DE-HGF)POF2-411$$a411 - Computational Science and Mathematical Methods (POF2-411)$$cPOF2-411$$fPOF II$$x0
000152041 536__ $$0G:(DE-Juel-1)ATMLPP$$aATMLPP - ATML Parallel Performance (ATMLPP)$$cATMLPP$$x1
000152041 7001_ $$0P:(DE-Juel1)143710$$aPhilippen, Peter$$b1$$ufzj
000152041 7001_ $$0P:(DE-HGF)0$$aMorin, Laurent$$b2
000152041 7001_ $$0P:(DE-Juel1)132112$$aGeimer, Markus$$b3$$ufzj
000152041 7001_ $$0P:(DE-Juel1)132199$$aMohr, Bernd$$b4$$ufzj
000152041 773__ $$a10.3233/978-1-61499-381-0-773
000152041 8564_ $$uhttps://juser.fz-juelich.de/record/152041/files/FZJ-2014-01861.pdf$$yRestricted
000152041 909CO $$ooai:juser.fz-juelich.de:152041$$pVDB
000152041 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)142180$$aForschungszentrum Jülich GmbH$$b0$$kFZJ
000152041 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)143710$$aForschungszentrum Jülich GmbH$$b1$$kFZJ
000152041 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132112$$aForschungszentrum Jülich GmbH$$b3$$kFZJ
000152041 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132199$$aForschungszentrum Jülich GmbH$$b4$$kFZJ
000152041 9132_ $$0G:(DE-HGF)POF3-511$$1G:(DE-HGF)POF3-510$$2G:(DE-HGF)POF3-500$$aDE-HGF$$bKey Technologies$$lSupercomputing & Big Data$$vComputational Science and Mathematical Methods$$x0
000152041 9131_ $$0G:(DE-HGF)POF2-411$$1G:(DE-HGF)POF2-410$$2G:(DE-HGF)POF2-400$$3G:(DE-HGF)POF2$$4G:(DE-HGF)POF$$aDE-HGF$$bSchlüsseltechnologien$$lSupercomputing$$vComputational Science and Mathematical Methods$$x0
000152041 9141_ $$y2014
000152041 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000152041 980__ $$acontrib
000152041 980__ $$aVDB
000152041 980__ $$acontb
000152041 980__ $$aI:(DE-Juel1)JSC-20090406
000152041 980__ $$aUNRESTRICTED