000903617 001__ 903617
000903617 005__ 20250822121408.0
000903617 0247_ $$2Handle$$a2128/29491
000903617 037__ $$aFZJ-2021-05271
000903617 041__ $$aEnglish
000903617 1001_ $$0P:(DE-Juel1)145478$$aHerten, Andreas$$b0$$eCorresponding author$$ufzj
000903617 1112_ $$d2021-04-26 - 2021-04-30
000903617 245__ $$aPRACE Training Course: GPU Programming with CUDA
000903617 260__ $$c2021
000903617 3367_ $$2DRIVER$$alecture
000903617 3367_ $$031$$2EndNote$$aGeneric
000903617 3367_ $$2BibTeX$$aMISC
000903617 3367_ $$0PUB:(DE-HGF)17$$2PUB:(DE-HGF)$$aLecture$$blecture$$mlecture$$s1639657214_31592$$xOther
000903617 3367_ $$2ORCID$$aLECTURE_SPEECH
000903617 3367_ $$2DataCite$$aText
000903617 500__ $$aOnline course within the PRACE and FZJ training program.
000903617 520__ $$aGPU-accelerated computing drives current scientific research. Writing fast numeric algorithms for GPUs offers high application performance by offloading compute-intensive portions of the code to an NVIDIA GPU. The course will cover basic aspects of GPU architectures and programming. The focus is on the parallel programming language CUDA-C, which allows maximum control of NVIDIA GPU hardware. Examples of increasing complexity are used to demonstrate optimization and tuning of scientific applications. Topics covered include: introduction to GPU/parallel computing; the CUDA programming model; GPU libraries such as cuBLAS and cuFFT; tools for debugging and profiling; performance optimizations. This course is a PRACE training course.
000903617 536__ $$0G:(DE-HGF)POF4-5121$$a5121 - Supercomputing & Big Data Facilities (POF4-512)$$cPOF4-512$$fPOF IV$$x0
000903617 536__ $$0G:(DE-HGF)POF4-5122$$a5122 - Future Computing & Big Data Systems (POF4-512)$$cPOF4-512$$fPOF IV$$x1
000903617 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x2
000903617 536__ $$0G:(DE-Juel-1)ATML-X-DEV$$aATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)$$cATML-X-DEV$$x3
000903617 7001_ $$0P:(DE-Juel1)132189$$aMeinke, Jan$$b1$$eCorresponding author$$ufzj
000903617 7001_ $$0P:(DE-Juel1)176293$$aHaghighi Mood, Kaveh$$b2$$ufzj
000903617 7001_ $$0P:(DE-Juel1)180799$$aHrywniak, Markus$$b3$$ufzj
000903617 7001_ $$0P:(DE-Juel1)137023$$aKraus, Jiri$$b4$$ufzj
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/00-Overview.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/01-CUDA-Intro.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/02-CUDA-Tools.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/03-Matrix-Mul.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/04-Performance-Opt.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/05-Multi-GPU.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/06-CUDA-Streams-Events.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/07-Matrix-Mul-Tiled.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/08-Cooperative-Groups.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/09-CUDA-C%2B%2B.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/10-CUB.pdf$$yOpenAccess
000903617 8564_ $$uhttps://juser.fz-juelich.de/record/903617/files/11-CUDA-Fortran.pdf$$yOpenAccess
000903617 909CO $$ooai:juser.fz-juelich.de:903617$$pdriver$$pVDB$$popen_access$$popenaire
000903617 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)145478$$aForschungszentrum Jülich$$b0$$kFZJ
000903617 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132189$$aForschungszentrum Jülich$$b1$$kFZJ
000903617 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)176293$$aForschungszentrum Jülich$$b2$$kFZJ
000903617 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180799$$aForschungszentrum Jülich$$b3$$kFZJ
000903617 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)137023$$aForschungszentrum Jülich$$b4$$kFZJ
000903617 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5121$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x0
000903617 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5122$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x1
000903617 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x2
000903617 9141_ $$y2021
000903617 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000903617 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000903617 980__ $$alecture
000903617 980__ $$aVDB
000903617 980__ $$aUNRESTRICTED
000903617 980__ $$aI:(DE-Juel1)JSC-20090406
000903617 9801_ $$aFullTexts