000884765 001__ 884765
000884765 005__ 20220923174108.0
000884765 020__ $$a978-3-030-50436-6
000884765 0247_ $$2doi$$a10.1007/978-3-030-50436-6_31
000884765 0247_ $$2Handle$$a2128/25774
000884765 0247_ $$2WOS$$aWOS:000841686400031
000884765 037__ $$aFZJ-2020-03241
000884765 1001_ $$0P:(DE-Juel1)169856$$aMorgenstern, Laura$$b0$$eCorresponding author$$ufzj
000884765 1112_ $$aInternational Conference on Computational Science 2020$$cAmsterdam$$d2020-06-03 - 2020-06-05$$gICCS 2020$$wThe Netherlands
000884765 245__ $$aNUMA-Awareness as a Plug-In for an Eventify-Based Fast Multipole Method
000884765 260__ $$c2020
000884765 29510 $$aComputational Science – ICCS 2020
000884765 300__ $$a428-441
000884765 3367_ $$2ORCID$$aCONFERENCE_PAPER
000884765 3367_ $$033$$2EndNote$$aConference Paper
000884765 3367_ $$2BibTeX$$aINPROCEEDINGS
000884765 3367_ $$2DRIVER$$aconferenceObject
000884765 3367_ $$2DataCite$$aOutput Types/Conference Paper
000884765 3367_ $$0PUB:(DE-HGF)8$$2PUB:(DE-HGF)$$aContribution to a conference proceedings$$bcontrib$$mcontrib$$s1601298347_11440
000884765 3367_ $$0PUB:(DE-HGF)7$$2PUB:(DE-HGF)$$aContribution to a book$$mcontb
000884765 520__ $$aFollowing the trend towards Exascale, today’s supercomputers consist of increasingly complex and heterogeneous compute nodes. To exploit the performance of these systems, research software in HPC needs to keep up with the rapid development of hardware architectures. Since manually tuning software for every architecture is neither sustainable nor viable, we tackle this challenge through appropriate software design. In this article, we aim to improve the performance and sustainability of FMSolvr, a parallel Fast Multipole Method for Molecular Dynamics, by adapting it to Non-Uniform Memory Access (NUMA) architectures in a portable and maintainable way. The parallelization of FMSolvr is based on Eventify, an event-based tasking framework we co-developed with FMSolvr. We describe a layered software architecture that enables the separation of the Fast Multipole Method from its parallelization. The focus of this article is on the development and analysis of a reusable NUMA module that improves performance while keeping both layers separated to preserve maintainability and extensibility. By means of the NUMA module, we introduce diverse NUMA-aware data distribution, thread pinning, and work stealing policies for FMSolvr. During the performance analysis, the modular design of the NUMA module proved advantageous because it facilitates the combination, interchange, and redesign of the developed policies. The performance analysis reveals that these policies reduce the runtime of FMSolvr by 21%, from 1.48 ms to 1.16 ms.
000884765 536__ $$0G:(DE-HGF)POF3-511$$a511 - Computational Science and Mathematical Methods (POF3-511)$$cPOF3-511$$fPOF III$$x0
000884765 536__ $$0G:(DE-Juel1)PHD-NO-GRANT-20170405$$aPhD no Grant - Doktorand ohne besondere Förderung (PHD-NO-GRANT-20170405)$$cPHD-NO-GRANT-20170405$$x1
000884765 7001_ $$0P:(DE-Juel1)161429$$aHaensel, David$$b1$$ufzj
000884765 7001_ $$0P:(DE-Juel1)157750$$aBeckmann, Andreas$$b2$$ufzj
000884765 7001_ $$0P:(DE-Juel1)132152$$aKabadshow, Ivo$$b3$$ufzj
000884765 773__ $$a10.1007/978-3-030-50436-6_31
000884765 8564_ $$uhttps://juser.fz-juelich.de/record/884765/files/Morgenstern2020_Chapter_NUMA-AwarenessAsAPlug-InForAnE.pdf$$yOpenAccess
000884765 8564_ $$uhttps://juser.fz-juelich.de/record/884765/files/Morgenstern2020_Chapter_NUMA-AwarenessAsAPlug-InForAnE.pdf?subformat=pdfa$$xpdfa$$yOpenAccess
000884765 909CO $$ooai:juser.fz-juelich.de:884765$$pdnbdelivery$$pdriver$$pVDB$$popen_access$$popenaire
000884765 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)169856$$aForschungszentrum Jülich$$b0$$kFZJ
000884765 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)161429$$aForschungszentrum Jülich$$b1$$kFZJ
000884765 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)157750$$aForschungszentrum Jülich$$b2$$kFZJ
000884765 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132152$$aForschungszentrum Jülich$$b3$$kFZJ
000884765 9131_ $$0G:(DE-HGF)POF3-511$$1G:(DE-HGF)POF3-510$$2G:(DE-HGF)POF3-500$$3G:(DE-HGF)POF3$$4G:(DE-HGF)POF$$aDE-HGF$$bKey Technologies$$lSupercomputing & Big Data$$vComputational Science and Mathematical Methods$$x0
000884765 9141_ $$y2020
000884765 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000884765 920__ $$lyes
000884765 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000884765 9201_ $$0I:(DE-Juel1)IAS-7-20180321$$kIAS-7$$lZivile Sicherheitsforschung$$x1
000884765 980__ $$acontrib
000884765 980__ $$aVDB
000884765 980__ $$aUNRESTRICTED
000884765 980__ $$acontb
000884765 980__ $$aI:(DE-Juel1)JSC-20090406
000884765 980__ $$aI:(DE-Juel1)IAS-7-20180321
000884765 9801_ $$aFullTexts