001     1026442
005     20260122235033.0
024 7 _ |2 doi
|a 10.5194/gmd-17-4077-2024
024 7 _ |2 ISSN
|a 1991-959X
024 7 _ |2 ISSN
|a 1991-9603
024 7 _ |2 datacite_doi
|a 10.34734/FZJ-2024-03395
024 7 _ |2 WOS
|a WOS:001226505800001
037 _ _ |a FZJ-2024-03395
041 _ _ |a English
082 _ _ |a 550
100 1 _ |0 P:(DE-Juel1)129125
|a Hoffmann, Lars
|b 0
|e Corresponding author
|u fzj
245 _ _ |a Accelerating Lagrangian transport simulations on graphics processing units: performance optimizations of Massive-Parallel Trajectory Calculations (MPTRAC) v2.6
260 _ _ |a Katlenburg-Lindau
|b Copernicus
|c 2024
336 7 _ |2 DRIVER
|a article
336 7 _ |2 DataCite
|a Output Types/Journal article
336 7 _ |0 PUB:(DE-HGF)16
|2 PUB:(DE-HGF)
|a Journal Article
|b journal
|m journal
|s 1717745098_3799
336 7 _ |2 BibTeX
|a ARTICLE
336 7 _ |2 ORCID
|a JOURNAL_ARTICLE
336 7 _ |0 0
|2 EndNote
|a Journal Article
520 _ _ |a Lagrangian particle dispersion models are indispensable tools for the study of atmospheric transport processes. However, Lagrangian transport simulations can become numerically expensive when large numbers of air parcels are involved. To accelerate these simulations, we made considerable efforts to port the Massive-Parallel Trajectory Calculations (MPTRAC) model to graphics processing units (GPUs). Here we discuss performance optimizations of the major bottleneck of the GPU code of MPTRAC, the advection kernel. Timeline, roofline, and memory analyses of the baseline GPU code revealed that the application is memory-bound, and performance suffers from near-random memory access patterns. By changing the data structure of the horizontal wind and vertical velocity fields of the global meteorological data driving the simulations from structure of arrays (SoAs) to array of structures (AoSs) and by introducing a sorting method for better memory alignment of the particle data, performance was greatly improved. We evaluated the performance on NVIDIA A100 GPUs of the Jülich Wizard for European Leadership Science (JUWELS) Booster module at the Jülich Supercomputing Center, Germany. For our largest test case, transport simulations with 10^8 particles driven by the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis, we found that the runtime for the full set of physics computations was reduced by 75 %, including a reduction of 85 % for the advection kernel. In addition to demonstrating the benefits of code optimization for GPUs, we show that the runtime of central-processing-unit-only (CPU-only) simulations is also improved. For our largest test case, we found a runtime reduction of 34 % for the physics computations, including a reduction of 65 % for the advection kernel.
The code optimizations discussed here bring the MPTRAC model closer to applications on upcoming exascale high-performance computing systems and will also be of interest for optimizing the performance of other models using particle methods.
536 _ _ |0 G:(DE-HGF)POF4-5111
|a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
|c POF4-511
|f POF IV
|x 0
536 _ _ |0 G:(DE-HGF)POF4-2112
|a 2112 - Climate Feedbacks (POF4-211)
|c POF4-211
|f POF IV
|x 1
536 _ _ |0 G:(DE-HGF)POF4-5122
|a 5122 - Future Computing & Big Data Systems (POF4-512)
|c POF4-512
|f POF IV
|x 2
536 _ _ |0 G:(DE-HGF)POF4-5112
|a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|c POF4-511
|f POF IV
|x 3
536 _ _ |0 G:(DE-Juel-1)ATML-X-DEV
|a ATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)
|c ATML-X-DEV
|x 4
536 _ _ |0 G:(DE-Juel-1)SDLCS
|a Simulation and Data Lab Climate Science
|c SDLCS
|x 5
588 _ _ |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de
700 1 _ |0 P:(DE-Juel1)176293
|a Haghighi Mood, Kaveh
|b 1
700 1 _ |0 P:(DE-Juel1)145478
|a Herten, Andreas
|b 2
700 1 _ |0 P:(DE-HGF)0
|a Hrywniak, Markus
|b 3
700 1 _ |0 P:(DE-HGF)0
|a Kraus, Jiri
|b 4
700 1 _ |0 P:(DE-Juel1)180256
|a Clemens, Jan
|b 5
700 1 _ |0 P:(DE-Juel1)187051
|a Liu, Mingzhao
|b 6
773 _ _ |0 PERI:(DE-600)2456725-5
|a 10.5194/gmd-17-4077-2024
|g Vol. 17, no. 9, p. 4077 - 4094
|n 9
|p 4077 - 4094
|t Geoscientific model development
|v 17
|x 1991-959X
|y 2024
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/Invoice_Helmholtz-PUC-2024-57.pdf
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/Invoice_Helmholtz-PUC-2024-57.gif?subformat=icon
|x icon
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/Invoice_Helmholtz-PUC-2024-57.jpg?subformat=icon-1440
|x icon-1440
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/Invoice_Helmholtz-PUC-2024-57.jpg?subformat=icon-180
|x icon-180
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/Invoice_Helmholtz-PUC-2024-57.jpg?subformat=icon-640
|x icon-640
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/gmd-17-4077-2024.pdf
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/gmd-17-4077-2024.gif?subformat=icon
|x icon
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/gmd-17-4077-2024.jpg?subformat=icon-1440
|x icon-1440
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/gmd-17-4077-2024.jpg?subformat=icon-180
|x icon-180
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1026442/files/gmd-17-4077-2024.jpg?subformat=icon-640
|x icon-640
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1026442
|p openaire
|p open_access
|p OpenAPC
|p driver
|p VDB
|p openCost
|p dnbdelivery
910 1 _ |0 I:(DE-588b)5008462-8
|6 P:(DE-Juel1)129125
|a Forschungszentrum Jülich
|b 0
|k FZJ
910 1 _ |0 I:(DE-588b)5008462-8
|6 P:(DE-Juel1)176293
|a Forschungszentrum Jülich
|b 1
|k FZJ
910 1 _ |0 I:(DE-588b)5008462-8
|6 P:(DE-Juel1)145478
|a Forschungszentrum Jülich
|b 2
|k FZJ
910 1 _ |0 I:(DE-588b)5008462-8
|6 P:(DE-Juel1)180256
|a Forschungszentrum Jülich
|b 5
|k FZJ
910 1 _ |0 I:(DE-588b)5008462-8
|6 P:(DE-Juel1)187051
|a Forschungszentrum Jülich
|b 6
|k FZJ
913 1 _ |0 G:(DE-HGF)POF4-511
|1 G:(DE-HGF)POF4-510
|2 G:(DE-HGF)POF4-500
|3 G:(DE-HGF)POF4
|4 G:(DE-HGF)POF
|9 G:(DE-HGF)POF4-5111
|a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|v Enabling Computational- & Data-Intensive Science and Engineering
|x 0
913 1 _ |0 G:(DE-HGF)POF4-211
|1 G:(DE-HGF)POF4-210
|2 G:(DE-HGF)POF4-200
|3 G:(DE-HGF)POF4
|4 G:(DE-HGF)POF
|9 G:(DE-HGF)POF4-2112
|a DE-HGF
|b Forschungsbereich Erde und Umwelt
|l Erde im Wandel – Unsere Zukunft nachhaltig gestalten
|v Die Atmosphäre im globalen Wandel
|x 1
913 1 _ |0 G:(DE-HGF)POF4-512
|1 G:(DE-HGF)POF4-510
|2 G:(DE-HGF)POF4-500
|3 G:(DE-HGF)POF4
|4 G:(DE-HGF)POF
|9 G:(DE-HGF)POF4-5122
|a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|v Supercomputing & Big Data Infrastructures
|x 2
913 1 _ |0 G:(DE-HGF)POF4-511
|1 G:(DE-HGF)POF4-510
|2 G:(DE-HGF)POF4-500
|3 G:(DE-HGF)POF4
|4 G:(DE-HGF)POF
|9 G:(DE-HGF)POF4-5112
|a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|v Enabling Computational- & Data-Intensive Science and Engineering
|x 3
914 1 _ |y 2024
915 p c |0 PC:(DE-HGF)0000
|2 APC
|a APC keys set
915 p c |0 PC:(DE-HGF)0003
|2 APC
|a DOAJ Journal
915 _ _ |0 StatID:(DE-HGF)0160
|2 StatID
|a DBCoverage
|b Essential Science Indicators
|d 2023-10-25
915 _ _ |0 LIC:(DE-HGF)CCBY4
|2 HGFVOC
|a Creative Commons Attribution CC BY 4.0
915 _ _ |0 StatID:(DE-HGF)0501
|2 StatID
|a DBCoverage
|b DOAJ Seal
|d 2022-12-20T09:29:04Z
915 _ _ |0 StatID:(DE-HGF)0500
|2 StatID
|a DBCoverage
|b DOAJ
|d 2022-12-20T09:29:04Z
915 _ _ |0 StatID:(DE-HGF)0113
|2 StatID
|a WoS
|b Science Citation Index Expanded
|d 2023-10-25
915 _ _ |0 StatID:(DE-HGF)0700
|2 StatID
|a Fees
|d 2023-10-25
915 _ _ |0 StatID:(DE-HGF)0510
|2 StatID
|a OpenAccess
915 _ _ |0 StatID:(DE-HGF)0561
|2 StatID
|a Article Processing Charges
|d 2023-10-25
915 _ _ |0 StatID:(DE-HGF)0200
|2 StatID
|a DBCoverage
|b SCOPUS
|d 2024-12-21
915 _ _ |0 StatID:(DE-HGF)0300
|2 StatID
|a DBCoverage
|b Medline
|d 2024-12-21
915 _ _ |0 StatID:(DE-HGF)0030
|2 StatID
|a Peer Review
|b DOAJ : Open peer review
|d 2022-12-20T09:29:04Z
915 _ _ |0 StatID:(DE-HGF)0600
|2 StatID
|a DBCoverage
|b Ebsco Academic Search
|d 2024-12-21
915 _ _ |0 StatID:(DE-HGF)0030
|2 StatID
|a Peer Review
|b ASC
|d 2024-12-21
915 _ _ |0 StatID:(DE-HGF)0199
|2 StatID
|a DBCoverage
|b Clarivate Analytics Master Journal List
|d 2024-12-21
915 _ _ |0 StatID:(DE-HGF)1150
|2 StatID
|a DBCoverage
|b Current Contents - Physical, Chemical and Earth Sciences
|d 2024-12-21
915 _ _ |0 StatID:(DE-HGF)0150
|2 StatID
|a DBCoverage
|b Web of Science Core Collection
|d 2024-12-21
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
920 1 _ |0 I:(DE-Juel1)IEK-7-20101013
|k IEK-7
|l Stratosphäre
|x 1
920 1 _ |0 I:(DE-Juel1)CASA-20230315
|k CASA
|l Center for Advanced Simulation and Analytics
|x 2
980 1 _ |a APC
980 1 _ |a FullTexts
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a I:(DE-Juel1)IEK-7-20101013
980 _ _ |a I:(DE-Juel1)CASA-20230315
980 _ _ |a APC
981 _ _ |a I:(DE-Juel1)ICE-4-20101013

