001049836 001__ 1049836
001049836 005__ 20260108204824.0
001049836 037__ $$aFZJ-2025-05597
001049836 1001_ $$0P:(DE-Juel1)132189$$aMeinke, Jan$$b0$$eCorresponding author
001049836 1112_ $$aJSC - as part of the Training Programme of Forschungszentrum Jülich$$cJülich / online$$d2025-07-07 - 2025-07-11$$wGermany
001049836 245__ $$aGPU Programming Part 2: Advanced GPU Programming
001049836 260__ $$c2025
001049836 3367_ $$2DRIVER$$alecture
001049836 3367_ $$031$$2EndNote$$aGeneric
001049836 3367_ $$2BibTeX$$aMISC
001049836 3367_ $$0PUB:(DE-HGF)17$$2PUB:(DE-HGF)$$aLecture$$blecture$$mlecture$$s1767814127_10642$$xOther
001049836 3367_ $$2ORCID$$aLECTURE_SPEECH
001049836 3367_ $$2DataCite$$aText
001049836 520__ $$aGPU-accelerated computing drives current scientific research. Writing fast numeric algorithms for GPUs offers high application performance by offloading compute-intensive portions of the code to a GPU. This advanced course consists of modules providing more in-depth coverage of multi-GPU programming, modern CUDA concepts, CUDA Fortran, and portable programming models such as OpenACC and C++ parallel STL algorithms. Topics covered will include A) Advanced Multi-GPU Programming with MPI B) Advanced Multi-GPU Programming with NCCL and NVSHMEM C) Advanced and Modern CUDA Concepts (Cooperative Groups, CUDA Graphs, CUB Primitives, Modern C++ Programming) D) Kokkos E) GPU Programming with Abstractions (OpenACC, Standard Language Programming (pSTL))
001049836 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001049836 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x1
001049836 536__ $$0G:(DE-HGF)POF4-5122$$a5122 - Future Computing & Big Data Systems (POF4-512)$$cPOF4-512$$fPOF IV$$x2
001049836 536__ $$0G:(DE-Juel-1)ATML-X-DEV$$aATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)$$cATML-X-DEV$$x3
001049836 536__ $$0G:(DE-Juel-1)DB001492$$aBMBF 01 1H1 6013, NRW 325 – 8.03 – 133340 - SiVeGCS (DB001492)$$cDB001492$$x4
001049836 7001_ $$0P:(DE-Juel1)145478$$aHerten, Andreas$$b1$$eCorresponding author
001049836 7001_ $$0P:(DE-Juel1)180799$$aHrywniak, Markus$$b2
001049836 7001_ $$0P:(DE-Juel1)164813$$aBadwaik, Jayesh$$b3
001049836 7001_ $$0P:(DE-Juel1)176293$$aHaghighi Mood, Kaveh$$b4
001049836 7001_ $$0P:(DE-Juel1)208747$$aMorgenstern, Laura$$b5$$ufzj
001049836 8564_ $$uhttps://indico3-jsc.fz-juelich.de/event/235/
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/01_MPI.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/02_NCCL_NVSHMEM.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/03_Cooperative_Groups.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/04_CUDA_Graphs.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/05_Modern_C%2B%2B.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/06_CUB.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/07_CUDA_Fortran.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/08_OpenACC.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/09_C%2B%2B_pSTL.pdf$$yRestricted
001049836 8564_ $$uhttps://juser.fz-juelich.de/record/1049836/files/10_Kokkos.pdf$$yRestricted
001049836 909CO $$ooai:juser.fz-juelich.de:1049836$$pVDB
001049836 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132189$$aForschungszentrum Jülich$$b0$$kFZJ
001049836 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)145478$$aForschungszentrum Jülich$$b1$$kFZJ
001049836 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180799$$aForschungszentrum Jülich$$b2$$kFZJ
001049836 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)164813$$aForschungszentrum Jülich$$b3$$kFZJ
001049836 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)176293$$aForschungszentrum Jülich$$b4$$kFZJ
001049836 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)208747$$aForschungszentrum Jülich$$b5$$kFZJ
001049836 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001049836 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x1
001049836 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5122$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x2
001049836 9141_ $$y2025
001049836 920__ $$lyes
001049836 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001049836 980__ $$alecture
001049836 980__ $$aVDB
001049836 980__ $$aI:(DE-Juel1)JSC-20090406
001049836 980__ $$aUNRESTRICTED