001     884765
005     20220923174108.0
020 _ _ |a 978-3-030-50436-6
024 7 _ |a 10.1007/978-3-030-50436-6_31
|2 doi
024 7 _ |a 2128/25774
|2 Handle
024 7 _ |a WOS:000841686400031
|2 WOS
037 _ _ |a FZJ-2020-03241
100 1 _ |a Morgenstern, Laura
|0 P:(DE-Juel1)169856
|b 0
|e Corresponding author
|u fzj
111 2 _ |a International Conference on Computational Science 2020
|g ICCS 2020
|c Amsterdam
|d 2020-06-03 - 2020-06-05
|w The Netherlands
245 _ _ |a NUMA-Awareness as a Plug-In for an Eventify-Based Fast Multipole Method
260 _ _ |c 2020
295 1 0 |a Computational Science – ICCS 2020
300 _ _ |a 428-441
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1601298347_11440
|2 PUB:(DE-HGF)
336 7 _ |a Contribution to a book
|0 PUB:(DE-HGF)7
|2 PUB:(DE-HGF)
|m contb
520 _ _ |a Following the trend towards Exascale, today’s supercomputers consist of increasingly complex and heterogeneous compute nodes. To exploit the performance of these systems, research software in HPC needs to keep up with the rapid development of hardware architectures. Since manual tuning of software to each and every architecture is neither sustainable nor viable, we aim to tackle this challenge through appropriate software design. In this article, we aim to improve the performance and sustainability of FMSolvr, a parallel Fast Multipole Method for Molecular Dynamics, by adapting it to Non-Uniform Memory Access architectures in a portable and maintainable way. The parallelization of FMSolvr is based on Eventify, an event-based tasking framework we co-developed with FMSolvr. We describe a layered software architecture that enables the separation of the Fast Multipole Method from its parallelization. The focus of this article is on the development and analysis of a reusable NUMA module that improves performance while keeping both layers separated to preserve maintainability and extensibility. By means of the NUMA module we introduce diverse NUMA-aware data distribution, thread pinning and work stealing policies for FMSolvr. During the performance analysis the modular design of the NUMA module was advantageous since it facilitates combination, interchange and redesign of the developed policies. The performance analysis reveals that the runtime of FMSolvr is reduced by 21% from 1.48 ms to 1.16 ms through these policies.
536 _ _ |a 511 - Computational Science and Mathematical Methods (POF3-511)
|0 G:(DE-HGF)POF3-511
|c POF3-511
|f POF III
|x 0
536 _ _ |0 G:(DE-Juel1)PHD-NO-GRANT-20170405
|x 1
|c PHD-NO-GRANT-20170405
|a PhD no Grant - Doktorand ohne besondere Förderung (PHD-NO-GRANT-20170405)
700 1 _ |a Haensel, David
|0 P:(DE-Juel1)161429
|b 1
|u fzj
700 1 _ |a Beckmann, Andreas
|0 P:(DE-Juel1)157750
|b 2
|u fzj
700 1 _ |a Kabadshow, Ivo
|0 P:(DE-Juel1)132152
|b 3
|u fzj
773 _ _ |a 10.1007/978-3-030-50436-6_31
856 4 _ |y OpenAccess
|u https://juser.fz-juelich.de/record/884765/files/Morgenstern2020_Chapter_NUMA-AwarenessAsAPlug-InForAnE.pdf
856 4 _ |y OpenAccess
|x pdfa
|u https://juser.fz-juelich.de/record/884765/files/Morgenstern2020_Chapter_NUMA-AwarenessAsAPlug-InForAnE.pdf?subformat=pdfa
909 C O |o oai:juser.fz-juelich.de:884765
|p openaire
|p open_access
|p VDB
|p driver
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)169856
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)161429
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)157750
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 3
|6 P:(DE-Juel1)132152
913 1 _ |a DE-HGF
|b Key Technologies
|1 G:(DE-HGF)POF3-510
|0 G:(DE-HGF)POF3-511
|2 G:(DE-HGF)POF3-500
|v Computational Science and Mathematical Methods
|x 0
|4 G:(DE-HGF)POF
|3 G:(DE-HGF)POF3
|l Supercomputing & Big Data
914 1 _ |y 2020
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
920 1 _ |0 I:(DE-Juel1)IAS-7-20180321
|k IAS-7
|l Zivile Sicherheitsforschung
|x 1
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a contb
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a I:(DE-Juel1)IAS-7-20180321
980 1 _ |a FullTexts


Marc 21