001034807 001__ 1034807
001034807 005__ 20250822121412.0
001034807 0247_ $$2datacite_doi$$a10.34734/FZJ-2024-07560
001034807 037__ $$aFZJ-2024-07560
001034807 1001_ $$0P:(DE-Juel1)145478$$aHerten, Andreas$$b0$$eCorresponding author$$ufzj
001034807 1112_ $$aJSC - as part of the Training Programme of Forschungszentrum Jülich$$cJülich$$d2024-04-08 - 2024-04-10$$wGermany
001034807 245__ $$aGPU Programming Part 1: Foundations
001034807 260__ $$c2024
001034807 3367_ $$2DRIVER$$alecture
001034807 3367_ $$031$$2EndNote$$aGeneric
001034807 3367_ $$2BibTeX$$aMISC
001034807 3367_ $$0PUB:(DE-HGF)17$$2PUB:(DE-HGF)$$aLecture$$blecture$$mlecture$$s1737529395_26572$$xOther
001034807 3367_ $$2ORCID$$aLECTURE_SPEECH
001034807 3367_ $$2DataCite$$aText
001034807 520__ $$aGPU-accelerated computing drives current scientific research. Writing fast numeric algorithms for GPUs offers high application performance by offloading compute-intensive portions of the code to a GPU. The course will cover aspects of GPU architectures and programming. The focus is on the parallel programming language CUDA C++, which allows maximum control of NVIDIA GPU hardware. Examples of increasing complexity are used to demonstrate optimization and tuning of scientific applications. This is a basic course covering the foundations of GPU programming, including an introduction to GPU/parallel computing, programming with CUDA, GPU libraries, tools for debugging and profiling, and performance optimizations. Topics covered will include an introduction to GPUs and GPU computing, the CUDA programming model, tools for debugging and profiling, GPU libraries (like cuBLAS, cuFFT), and an introduction to multi-GPU programming.
001034807 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001034807 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x1
001034807 536__ $$0G:(DE-HGF)POF4-5122$$a5122 - Future Computing & Big Data Systems (POF4-512)$$cPOF4-512$$fPOF IV$$x2
001034807 536__ $$0G:(DE-Juel-1)ATML-X-DEV$$aATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)$$cATML-X-DEV$$x3
001034807 7001_ $$0P:(DE-Juel1)132189$$aMeinke, Jan$$b1$$eCorresponding author$$ufzj
001034807 7001_ $$0P:(DE-Juel1)176293$$aHaghighi Mood, Kaveh$$b2$$ufzj
001034807 7001_ $$0P:(DE-Juel1)137023$$aKraus, Jiri$$b3$$ufzj
001034807 7001_ $$0P:(DE-Juel1)180799$$aHrywniak, Markus$$b4$$ufzj
001034807 8564_ $$uhttps://indico3-jsc.fz-juelich.de/event/158/
001034807 8564_ $$uhttps://juser.fz-juelich.de/record/1034807/files/02_cuda_tools_mhrywniak.pdf$$yOpenAccess
001034807 8564_ $$uhttps://juser.fz-juelich.de/record/1034807/files/04_cuda_transpose_mhrywniak.pdf$$yOpenAccess
001034807 8564_ $$uhttps://juser.fz-juelich.de/record/1034807/files/3-Matrix_Multiplication.pdf$$yOpenAccess
001034807 8564_ $$uhttps://juser.fz-juelich.de/record/1034807/files/CUDA_Streams_and_Events.pdf$$yOpenAccess
001034807 8564_ $$uhttps://juser.fz-juelich.de/record/1034807/files/Multi_GPU_Programming_with_MPI_and_CUDA.pdf$$yOpenAccess
001034807 8564_ $$uhttps://juser.fz-juelich.de/record/1034807/files/aherten-cuda-intro.pdf$$yOpenAccess
001034807 909CO $$ooai:juser.fz-juelich.de:1034807$$pdriver$$pVDB$$popen_access$$popenaire
001034807 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)145478$$aForschungszentrum Jülich$$b0$$kFZJ
001034807 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132189$$aForschungszentrum Jülich$$b1$$kFZJ
001034807 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)176293$$aForschungszentrum Jülich$$b2$$kFZJ
001034807 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)137023$$aForschungszentrum Jülich$$b3$$kFZJ
001034807 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180799$$aForschungszentrum Jülich$$b4$$kFZJ
001034807 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001034807 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x1
001034807 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5122$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x2
001034807 9141_ $$y2024
001034807 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001034807 920__ $$lyes
001034807 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001034807 980__ $$alecture
001034807 980__ $$aVDB
001034807 980__ $$aUNRESTRICTED
001034807 980__ $$aI:(DE-Juel1)JSC-20090406
001034807 9801_ $$aFullTexts