001     1029548
005     20250106213406.0
020 _ _ |a 978-3-95806-765-3
024 7 _ |2 K10Plus
|a K10Plus:1896603777
024 7 _ |2 datacite_doi
|a 10.34734/FZJ-2024-05160
024 7 _ |2 URN
|a urn:nbn:de:0001-20250106145649658-2552643-6
037 _ _ |a FZJ-2024-05160
100 1 _ |0 P:(DE-Juel1)169856
|a Morgenstern, Laura
|b 0
|e Corresponding author
245 _ _ |a Eventify Meets Heterogeneity: Enabling Fine-Grained Task-Parallelism on GPUs
|c Laura Morgenstern
|f - 2023-08-15
260 _ _ |a Jülich
|b Forschungszentrum Jülich GmbH, Zentralbibliothek, Verlag
|c 2024
300 _ _ |a xv, 110 Seiten : Illustrationen, Diagramme
336 7 _ |2 DataCite
|a Output Types/Dissertation
336 7 _ |0 PUB:(DE-HGF)3
|2 PUB:(DE-HGF)
|a Book
|m book
336 7 _ |2 ORCID
|a DISSERTATION
336 7 _ |2 BibTeX
|a PHDTHESIS
336 7 _ |0 2
|2 EndNote
|a Thesis
336 7 _ |0 PUB:(DE-HGF)11
|2 PUB:(DE-HGF)
|a Dissertation / PhD Thesis
|b phd
|m phd
|s 1724056284_9734
336 7 _ |2 DRIVER
|a doctoralThesis
490 0 _ |a Schriften des Forschungszentrums Jülich IAS Series
|v 63
502 _ _ |a Dissertation, Techn. Univ. Chemnitz, 2023
|b Dissertation
|c Techn. Univ. Chemnitz
|d 2023
|o 2023-08-15
520 _ _ |a Many scientific computing algorithms barely provide sufficient data parallelism to exploit the ever-increasing hardware parallelism of today’s heterogeneous computing environments. The challenge is to fully exploit the parallelization potential of such algorithms. To tackle this challenge, diverse task-parallel programming technologies have been introduced that allow for the flexible description of algorithms along task graphs. For algorithms with dense task graphs, however, task parallelism is still hard to exploit efficiently, since it is programmatically complex to describe and imposes high dependency resolution overheads on the execution model. This becomes especially challenging on GPUs, which are not designed for synchronization-heavy applications. The research objective of this thesis is an execution model that enables fine-grained task parallelism on GPUs. To reach this objective, the contributions of the thesis are fivefold. Firstly, it refines the stream interaction model behind Flynn’s Taxonomy as a uniform foundation for concurrency in architectures and programming models. Secondly, it analyzes the quantitative trends in CPU and GPU architectures and examines their influence on programming models. Thirdly, it introduces an execution model that enables threading, efficient blocking synchronization and queue-based task scheduling on GPUs. Fourthly, it ports the task-parallel programming library Eventify to GPUs. Fifthly, it examines the performance and sustainability of this approach with the task graph of a fast multipole method as a use case. The results show that fine-grained task parallelism improves execution time by an order of magnitude in comparison to classical loop-based data parallelism.
536 _ _ |0 G:(DE-HGF)POF4-5112
|a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|c POF4-511
|f POF IV
|x 0
588 _ _ |a Dataset connected to K10Plus
856 4 _ |u https://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.pdf
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.gif?subformat=icon
|x icon
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.jpg?subformat=icon-1440
|x icon-1440
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.jpg?subformat=icon-180
|x icon-180
|y OpenAccess
856 4 _ |u https://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.jpg?subformat=icon-640
|x icon-640
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1029548
|p openaire
|p open_access
|p urn
|p driver
|p VDB
|p dnbdelivery
910 1 _ |0 I:(DE-588b)5008462-8
|6 P:(DE-Juel1)169856
|a Forschungszentrum Jülich
|b 0
|k FZJ
913 1 _ |0 G:(DE-HGF)POF4-511
|1 G:(DE-HGF)POF4-510
|2 G:(DE-HGF)POF4-500
|3 G:(DE-HGF)POF4
|4 G:(DE-HGF)POF
|9 G:(DE-HGF)POF4-5112
|a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|v Enabling Computational- & Data-Intensive Science and Engineering
|x 0
914 1 _ |y 2024
915 _ _ |0 StatID:(DE-HGF)0510
|2 StatID
|a OpenAccess
915 _ _ |0 LIC:(DE-HGF)CCBY4
|2 HGFVOC
|a Creative Commons Attribution CC BY 4.0
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a phd
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a book
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

