001019123 001__ 1019123
001019123 005__ 20250822121411.0
001019123 0247_ $$2datacite_doi$$a10.34734/FZJ-2023-05176
001019123 037__ $$aFZJ-2023-05176
001019123 041__ $$aEnglish
001019123 1001_ $$0P:(DE-Juel1)145478$$aHerten, Andreas$$b0$$eCorresponding author$$ufzj
001019123 1112_ $$aJSC - as part of the Training Programme of Forschungszentrum Jülich$$cJülich$$d2023-04-17 - 2023-04-19$$wGermany
001019123 245__ $$aGPU Programming Part 1: Foundations
001019123 260__ $$c2023
001019123 3367_ $$2DRIVER$$alecture
001019123 3367_ $$031$$2EndNote$$aGeneric
001019123 3367_ $$2BibTeX$$aMISC
001019123 3367_ $$0PUB:(DE-HGF)17$$2PUB:(DE-HGF)$$aLecture$$blecture$$mlecture$$s1702637526_28499$$xOther
001019123 3367_ $$2ORCID$$aLECTURE_SPEECH
001019123 3367_ $$2DataCite$$aText
001019123 520__ $$aGPU-accelerated computing drives current scientific research. Writing fast numeric algorithms for GPUs offers high application performance by offloading compute-intensive portions of the code to a GPU. The course covers aspects of GPU architectures and programming. Focus is on the usage of the parallel programming language CUDA C++, which allows maximum control of NVIDIA GPU hardware. Examples of increasing complexity are used to demonstrate optimization and tuning of scientific applications. For the first time, the GPU Programming with CUDA course is held in two parts. This course is a basic course covering the foundations of GPU programming including an introduction to GPU/parallel computing, programming with CUDA, GPU libraries, tools for debugging and profiling, and performance optimizations. An advanced course with more involved and specific topics is available as an individual entry.
001019123 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001019123 536__ $$0G:(DE-HGF)POF4-5122$$a5122 - Future Computing & Big Data Systems (POF4-512)$$cPOF4-512$$fPOF IV$$x1
001019123 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x2
001019123 536__ $$0G:(DE-Juel-1)ATML-X-DEV$$aATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)$$cATML-X-DEV$$x3
001019123 588__ $$aDataset connected to DataCite
001019123 7001_ $$0P:(DE-Juel1)132189$$aMeinke, Jan$$b1$$eCorresponding author$$ufzj
001019123 7001_ $$0P:(DE-Juel1)176293$$aHaghighi Mood, Kaveh$$b2$$ufzj
001019123 7001_ $$0P:(DE-Juel1)137023$$aKraus, Jiri$$b3$$ufzj
001019123 7001_ $$0P:(DE-Juel1)180799$$aHrywniak, Markus$$b4$$ufzj
001019123 8564_ $$uhttps://indico3-jsc.fz-juelich.de/event/86/
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/02_cuda_tools_mhrywniak.pdf$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/04_cuda_transpose_mhrywniak.pdf$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/3-Matrix_Multiplication.pdf$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/CUDA_Streams_and_Events.pdf$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/Multi_GPU_Programming_with_MPI_and_CUDA.pdf$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/aherten-cuda-intro.pdf$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/02_cuda_tools_mhrywniak.gif?subformat=icon$$xicon$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/02_cuda_tools_mhrywniak.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/02_cuda_tools_mhrywniak.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/02_cuda_tools_mhrywniak.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/04_cuda_transpose_mhrywniak.gif?subformat=icon$$xicon$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/04_cuda_transpose_mhrywniak.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/04_cuda_transpose_mhrywniak.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/04_cuda_transpose_mhrywniak.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/3-Matrix_Multiplication.gif?subformat=icon$$xicon$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/3-Matrix_Multiplication.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/3-Matrix_Multiplication.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/3-Matrix_Multiplication.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/CUDA_Streams_and_Events.gif?subformat=icon$$xicon$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/CUDA_Streams_and_Events.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/CUDA_Streams_and_Events.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/CUDA_Streams_and_Events.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/Multi_GPU_Programming_with_MPI_and_CUDA.gif?subformat=icon$$xicon$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/Multi_GPU_Programming_with_MPI_and_CUDA.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/Multi_GPU_Programming_with_MPI_and_CUDA.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/Multi_GPU_Programming_with_MPI_and_CUDA.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/aherten-cuda-intro.gif?subformat=icon$$xicon$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/aherten-cuda-intro.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/aherten-cuda-intro.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001019123 8564_ $$uhttps://juser.fz-juelich.de/record/1019123/files/aherten-cuda-intro.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001019123 909CO $$ooai:juser.fz-juelich.de:1019123$$pdriver$$pVDB$$popen_access$$popenaire
001019123 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)145478$$aForschungszentrum Jülich$$b0$$kFZJ
001019123 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132189$$aForschungszentrum Jülich$$b1$$kFZJ
001019123 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)176293$$aForschungszentrum Jülich$$b2$$kFZJ
001019123 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)137023$$aForschungszentrum Jülich$$b3$$kFZJ
001019123 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)180799$$aForschungszentrum Jülich$$b4$$kFZJ
001019123 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001019123 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5122$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x1
001019123 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x2
001019123 9141_ $$y2023
001019123 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001019123 920__ $$lyes
001019123 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001019123 980__ $$alecture
001019123 980__ $$aVDB
001019123 980__ $$aUNRESTRICTED
001019123 980__ $$aI:(DE-Juel1)JSC-20090406
001019123 9801_ $$aFullTexts