Lecture (Other) FZJ-2025-05597

GPU Programming Part 2: Advanced GPU Programming


2025

Lecture at JSC - as part of the Training Programme of Forschungszentrum Jülich (Jülich / online, Germany), 7 Jul 2025 - 11 Jul 2025

Abstract: GPU-accelerated computing drives current scientific research. Writing fast numeric algorithms for GPUs offers high application performance by offloading compute-intensive portions of the code to a GPU. This advanced course consists of modules providing more in-depth coverage of multi-GPU programming, modern CUDA concepts, CUDA Fortran, and portable programming models such as OpenACC and C++ parallel STL algorithms. Topics covered will include: A) Advanced Multi-GPU Programming with MPI; B) Advanced Multi-GPU Programming with NCCL and NVSHMEM; C) Advanced and Modern CUDA Concepts (Cooperative Groups, CUDA Graphs, CUB Primitives, Modern C++ Programming); D) Kokkos; E) GPU Programming with Abstractions (OpenACC, Standard Language Programming (pSTL)).


Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
  2. 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
  3. 5122 - Future Computing & Big Data Systems (POF4-512)
  4. ATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)
  5. BMBF 01 1H1 6013, NRW 325 – 8.03 – 133340 - SiVeGCS (DB001492)

Appears in the scientific report 2025

The record appears in these collections:
Document types > Presentations > Lectures
Workflow collections > Public records
Institute Collections > JSC
Online First

 Record created 2025-12-18, last modified 2026-01-07


Restricted:
01_MPI (PDF)
02_NCCL_NVSHMEM (PDF)
03_Cooperative_Groups (PDF)
04_CUDA_Graphs (PDF)
05_Modern_C++ (PDF)
06_CUB (PDF)
07_CUDA_Fortran (PDF)
08_OpenACC (PDF)
09_C++_pSTL (PDF)
10_Kokkos (PDF)