000889100 001__ 889100
000889100 005__ 20210127115252.0
000889100 0247_ $$2doi$$a10.1016/j.cpc.2020.107159
000889100 0247_ $$2ISSN$$a0010-4655
000889100 0247_ $$2ISSN$$a1386-9485
000889100 0247_ $$2ISSN$$a1879-2944
000889100 0247_ $$2Handle$$a2128/26662
000889100 0247_ $$2altmetric$$aaltmetric:75734051
000889100 0247_ $$2WOS$$aWOS:000528002400017
000889100 037__ $$aFZJ-2021-00030
000889100 082__ $$a530
000889100 1001_ $$0P:(DE-HGF)0$$aCastagna, Jony$$b0
000889100 245__ $$aTowards extreme scale dissipative particle dynamics simulations using multiple GPGPUs
000889100 260__ $$aAmsterdam$$bNorth Holland Publ. Co.$$c2020
000889100 3367_ $$2DRIVER$$aarticle
000889100 3367_ $$2DataCite$$aOutput Types/Journal article
000889100 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1609857861_18366
000889100 3367_ $$2BibTeX$$aARTICLE
000889100 3367_ $$2ORCID$$aJOURNAL_ARTICLE
000889100 3367_ $$00$$2EndNote$$aJournal Article
000889100 520__ $$aA multi-GPGPU development for mesoscale simulations using the dissipative particle dynamics (DPD) method is presented. This distributed GPU-acceleration work extends the DL_MESO package to MPI+CUDA in order to exploit the computational power of the latest NVIDIA cards on hybrid CPU–GPU architectures. Details of the widely applicable algorithm implementation and of the memory-coalescing data structures are presented. The key algorithmic optimizations are also described: nearest-neighbour list searching of particle pairs for short-range forces, exchange of data between ranks, and overlapping of computation with communication. Strong and weak scaling performance analyses have been carried out with up to 4096 GPUs. A two-phase mixture separation test case with 1.8 billion particles has been run on the Piz Daint supercomputer at the Swiss National Supercomputing Centre. With CUDA-aware MPI, proper GPU affinity, and communication/computation overlap in the multi-GPU version, the final optimized code achieves more than 94% weak-scaling efficiency and more than 80% strong-scaling efficiency. To the best of our knowledge, this is the first report in the literature of DPD simulations being run on such a large number of GPUs. The remaining challenges and future work are discussed at the end of the paper.
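The abstract names CUDA-aware MPI and the overlap of computation with communication as the decisive multi-GPU optimizations. The following is a minimal, hypothetical MPI+CUDA sketch (not taken from the DL_MESO_DPD source) of that pattern: device pointers are handed directly to MPI for the halo exchange while a CUDA stream computes forces on interior particles; the kernel body, buffer sizes, and ring-neighbour topology are illustrative assumptions only.

// Hypothetical sketch of computation/communication overlap with CUDA-aware MPI.
#include <mpi.h>
#include <cuda_runtime.h>

__global__ void compute_forces(const double* pos, double* frc, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        // Placeholder for the short-range DPD pair forces found via a cell/neighbour list.
        frc[3 * i] = frc[3 * i + 1] = frc[3 * i + 2] = 0.0;
    }
}

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n_local = 100000, n_halo = 10000;   // illustrative particle counts
    double *pos, *frc, *halo_send, *halo_recv;
    cudaMalloc(&pos, 3 * (n_local + n_halo) * sizeof(double));
    cudaMalloc(&frc, 3 * (n_local + n_halo) * sizeof(double));
    cudaMalloc(&halo_send, 3 * n_halo * sizeof(double));
    cudaMalloc(&halo_recv, 3 * n_halo * sizeof(double));

    cudaStream_t compute_stream;
    cudaStreamCreate(&compute_stream);

    // 1. Launch the force kernel for interior particles asynchronously on its own stream.
    compute_forces<<<(n_local + 255) / 256, 256, 0, compute_stream>>>(pos, frc, n_local);

    // 2. Meanwhile exchange boundary (halo) particles with a neighbouring rank.
    //    CUDA-aware MPI accepts the device pointers directly, avoiding host staging copies.
    int neighbour = (rank + 1) % size;
    MPI_Request reqs[2];
    MPI_Irecv(halo_recv, 3 * n_halo, MPI_DOUBLE, neighbour, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(halo_send, 3 * n_halo, MPI_DOUBLE, neighbour, 0, MPI_COMM_WORLD, &reqs[1]);
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);

    // 3. Once both the interior kernel and the exchange complete, handle the halo particles.
    cudaStreamSynchronize(compute_stream);
    compute_forces<<<(n_halo + 255) / 256, 256>>>(pos + 3 * n_local, frc + 3 * n_local, n_halo);
    cudaDeviceSynchronize();

    cudaFree(pos); cudaFree(frc); cudaFree(halo_send); cudaFree(halo_recv);
    cudaStreamDestroy(compute_stream);
    MPI_Finalize();
    return 0;
}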
000889100 536__ $$0G:(DE-HGF)POF3-511$$a511 - Computational Science and Mathematical Methods (POF3-511)$$cPOF3-511$$fPOF III$$x0
000889100 536__ $$0G:(EU-Grant)676531$$aE-CAM - An e-infrastructure for software, training and consultancy in simulation and modelling (676531)$$c676531$$fH2020-EINFRA-2015-1$$x1
000889100 536__ $$0G:(DE-Juel1)prcoe02_20181001$$aPRACE CoE Allocation E-CAM (prcoe02_20181001)$$cprcoe02_20181001$$fPRACE CoE Allocation E-CAM$$x2
000889100 588__ $$aDataset connected to CrossRef
000889100 7001_ $$00000-0003-3545-3432$$aGuo, Xiaohu$$b1$$eCorresponding author
000889100 7001_ $$00000-0002-4708-573X$$aSeaton, Michael$$b2
000889100 7001_ $$0P:(DE-Juel1)143791$$aO’Cais, Alan$$b3
000889100 773__ $$0PERI:(DE-600)1466511-6$$a10.1016/j.cpc.2020.107159$$gVol. 251, p. 107159 -$$p107159 -$$tComputer physics communications$$v251$$x0010-4655$$y2020
000889100 8564_ $$uhttps://juser.fz-juelich.de/record/889100/files/1-s2.0-S0010465520300199-main.pdf$$yOpenAccess
000889100 909CO $$ooai:juser.fz-juelich.de:889100$$pdnbdelivery$$pec_fundedresources$$pVDB$$pdriver$$popen_access$$popenaire
000889100 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)143791$$aForschungszentrum Jülich$$b3$$kFZJ
000889100 9131_ $$0G:(DE-HGF)POF3-511$$1G:(DE-HGF)POF3-510$$2G:(DE-HGF)POF3-500$$3G:(DE-HGF)POF3$$4G:(DE-HGF)POF$$aDE-HGF$$bKey Technologies$$lSupercomputing & Big Data$$vComputational Science and Mathematical Methods$$x0
000889100 9141_ $$y2020
000889100 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2020-09-06
000889100 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0
000889100 915__ $$0StatID:(DE-HGF)0600$$2StatID$$aDBCoverage$$bEbsco Academic Search$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bCOMPUT PHYS COMMUN : 2018$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)9900$$2StatID$$aIF < 5$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000889100 915__ $$0StatID:(DE-HGF)0030$$2StatID$$aPeer Review$$bASC$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)1150$$2StatID$$aDBCoverage$$bCurrent Contents - Physical, Chemical and Earth Sciences$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2020-09-06
000889100 915__ $$0StatID:(DE-HGF)0420$$2StatID$$aNationallizenz$$d2020-09-06$$wger
000889100 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2020-09-06
000889100 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000889100 9801_ $$aFullTexts
000889100 980__ $$ajournal
000889100 980__ $$aVDB
000889100 980__ $$aUNRESTRICTED
000889100 980__ $$aI:(DE-Juel1)JSC-20090406