Application Porting and Optimization on GPU-Accelerated POWER Architectures

Herten, Andreas; Ravindar, Archana; Papatheodore, Tom; Pleiter, Dirk; Hagleitner, Christoph; Wagner, Mathias

Marc 21

001			868213
005			20210130004152.0
037	_	_	\|a FZJ-2019-06785
041	_	_	\|a English
100	1	_	\|a Herten, Andreas \|0 P:(DE-Juel1)145478 \|b 0 \|e Corresponding author \|u fzj
111	2	_	\|a International Conference for High Performance Computing, Networking, Storage and Analysis (The Supercomputing Conference) \|g SC19 \|c Denver, CO \|d 2019-11-18 - 2019-11-18 \|w USA
245	_	_	\|a Application Porting and Optimization on GPU-Accelerated POWER Architectures
260	_	_	\|c 2019
336	7	_	\|a lecture \|2 DRIVER
336	7	_	\|a Generic \|0 31 \|2 EndNote
336	7	_	\|a MISC \|2 BibTeX
336	7	_	\|a Lecture \|b lecture \|m lecture \|0 PUB:(DE-HGF)17 \|s 1576758136_22301 \|2 PUB:(DE-HGF) \|x After Call
336	7	_	\|a LECTURE_SPEECH \|2 ORCID
336	7	_	\|a Text \|2 DataCite
520	_	_	\|a The POWER processor has re-emerged as a technology for supercomputer architectures. One major reason is the tight integration of processor and GPU accelerator through the NVLink technology. Two major sites in the US, ORNL and LLNL, deployed their pre-exascale systems based on this new architecture (Summit and Sierra, respectively). This tutorial gives an opportunity to obtain in-depth knowledge of and experience with GPU-accelerated POWER nodes. It focuses on porting applications to a single node and covers architecture, compilers, performance analysis and tuning, and multi-GPU programming. The tutorial includes an overview of the NVLink-based node architectures, lectures on first-hand experience in porting to this architecture, and exercises using tools to focus on performance.
536	_	_	\|a 511 - Computational Science and Mathematical Methods (POF3-511) \|0 G:(DE-HGF)POF3-511 \|c POF3-511 \|f POF III \|x 0
536	_	_	\|a 513 - Supercomputer Facility (POF3-513) \|0 G:(DE-HGF)POF3-513 \|c POF3-513 \|f POF III \|x 1
700	1	_	\|a Pleiter, Dirk \|0 P:(DE-Juel1)144441 \|b 1 \|u fzj
700	1	_	\|a Wagner, Mathias \|0 P:(DE-HGF)0 \|b 2
700	1	_	\|a Hagleitner, Christoph \|0 P:(DE-HGF)0 \|b 3
700	1	_	\|a Ravindar, Archana \|0 P:(DE-HGF)0 \|b 4
700	1	_	\|a Papatheodore, Tom \|0 P:(DE-HGF)0 \|b 5
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/1-Hardware_Architecture.pdf \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/2-Performance_Counters.pdf \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/3-Compiler_Optimizations.pdf \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/4-Volta_GPU_Architecture.pdf \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/5-Multi_GPU_Programming.pdf \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/6-Best_Practices.pdf \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/1-Hardware_Architecture.pdf?subformat=pdfa \|x pdfa \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/2-Performance_Counters.pdf?subformat=pdfa \|x pdfa \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/3-Compiler_Optimizations.pdf?subformat=pdfa \|x pdfa \|y Restricted
856	4	_	\|u https://juser.fz-juelich.de/record/868213/files/6-Best_Practices.pdf?subformat=pdfa \|x pdfa \|y Restricted
909	C	O	\|o oai:juser.fz-juelich.de:868213 \|p VDB
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 0 \|6 P:(DE-Juel1)145478
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 1 \|6 P:(DE-Juel1)144441
913	1	_	\|a DE-HGF \|b Key Technologies \|1 G:(DE-HGF)POF3-510 \|0 G:(DE-HGF)POF3-511 \|2 G:(DE-HGF)POF3-500 \|v Computational Science and Mathematical Methods \|x 0 \|4 G:(DE-HGF)POF \|3 G:(DE-HGF)POF3 \|l Supercomputing & Big Data
913	1	_	\|a DE-HGF \|b Key Technologies \|1 G:(DE-HGF)POF3-510 \|0 G:(DE-HGF)POF3-513 \|2 G:(DE-HGF)POF3-500 \|v Supercomputer Facility \|x 1 \|4 G:(DE-HGF)POF \|3 G:(DE-HGF)POF3 \|l Supercomputing & Big Data
914	1	_	\|y 2019
920	_	_	\|l yes
920	1	_	\|0 I:(DE-Juel1)JSC-20090406 \|k JSC \|l Jülich Supercomputing Center \|x 0
980	_	_	\|a lecture
980	_	_	\|a VDB
980	_	_	\|a I:(DE-Juel1)JSC-20090406
980	_	_	\|a UNRESTRICTED
