001029548 001__ 1029548
001029548 005__ 20250106213406.0
001029548 0247_ $$2K10Plus$$aK10Plus:1896603777
001029548 0247_ $$2datacite_doi$$a10.34734/FZJ-2024-05160
001029548 0247_ $$2URN$$aurn:nbn:de:0001-20250106145649658-2552643-6
001029548 020__ $$a978-3-95806-765-3
001029548 037__ $$aFZJ-2024-05160
001029548 1001_ $$0P:(DE-Juel1)169856$$aMorgenstern, Laura$$b0$$eCorresponding author
001029548 245__ $$aEventify Meets Heterogeneity: Enabling Fine-Grained Task-Parallelism on GPUs$$cLaura Morgenstern$$f- 2023-08-15
001029548 260__ $$aJülich$$bForschungszentrum Jülich GmbH, Zentralbibliothek, Verlag$$c2024
001029548 300__ $$axv, 110 pages : illustrations, diagrams
001029548 3367_ $$2DataCite$$aOutput Types/Dissertation
001029548 3367_ $$0PUB:(DE-HGF)3$$2PUB:(DE-HGF)$$aBook$$mbook
001029548 3367_ $$2ORCID$$aDISSERTATION
001029548 3367_ $$2BibTeX$$aPHDTHESIS
001029548 3367_ $$02$$2EndNote$$aThesis
001029548 3367_ $$0PUB:(DE-HGF)11$$2PUB:(DE-HGF)$$aDissertation / PhD Thesis$$bphd$$mphd$$s1724056284_9734
001029548 3367_ $$2DRIVER$$adoctoralThesis
001029548 4900_ $$aSchriften des Forschungszentrums Jülich IAS Series$$v63
001029548 502__ $$aDissertation, Techn. Univ. Chemnitz, 2023$$bDissertation$$cTechn. Univ. Chemnitz$$d2023$$o2023-08-15
001029548 520__ $$aMany scientific computing algorithms barely provide sufficient data-parallelism to exploit the ever-increasing hardware parallelism of today’s heterogeneous computing environments. The challenge is to fully exploit the parallelization potential of such algorithms. To tackle this challenge, diverse task-parallel programming technologies have been introduced that allow for the flexible description of algorithms along task graphs. For algorithms with dense task graphs, however, task parallelism is still hard to exploit efficiently since it is programmatically complex to describe and imposes high dependency resolution overheads on the execution model. This becomes especially challenging on GPUs which are not designed for synchronization-heavy applications. The research objective of this thesis is an execution model that enables fine-grained task parallelism on GPUs. To reach this objective, the contributions of the thesis are fivefold. Firstly, it refines the stream interaction model behind Flynn’s Taxonomy as uniform foundation for concurrency in architectures and programming models. Secondly, it analyzes the quantitative trends in CPU and GPU architectures and examines their influence on programming models. Thirdly, it introduces an execution model that enables threading, efficient blocking synchronization and queue-based task scheduling on GPUs. Fourthly, it ports the task-parallel programming library Eventify to GPUs. And fifthly, it examines the performance and sustainability of this approach with the task graph of a fast multipole method as use case. The results show that fine-grained task parallelism improves execution time by an order of magnitude in comparison to classical loop-based data parallelism.
001029548 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001029548 588__ $$aDataset connected to K10Plus
001029548 8564_ $$uhttps://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.pdf$$yOpenAccess
001029548 8564_ $$uhttps://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.gif?subformat=icon$$xicon$$yOpenAccess
001029548 8564_ $$uhttps://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001029548 8564_ $$uhttps://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001029548 8564_ $$uhttps://juser.fz-juelich.de/record/1029548/files/IAS_Series_63.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001029548 909CO $$ooai:juser.fz-juelich.de:1029548$$pVDB$$pdriver$$purn$$popen_access$$popenaire$$pdnbdelivery
001029548 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001029548 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0
001029548 9141_ $$y2024
001029548 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)169856$$aForschungszentrum Jülich$$b0$$kFZJ
001029548 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001029548 920__ $$lyes
001029548 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001029548 980__ $$aphd
001029548 980__ $$aVDB
001029548 980__ $$aUNRESTRICTED
001029548 980__ $$abook
001029548 980__ $$aI:(DE-Juel1)JSC-20090406
001029548 9801_ $$aFullTexts