000824109 001__ 824109
000824109 005__ 20210129224942.0
000824109 037__ $$aFZJ-2016-06733
000824109 041__ $$aEnglish
000824109 1001_ $$0P:(DE-Juel1)145478$$aHerten, Andreas$$b0$$eCorresponding author$$ufzj
000824109 1112_ $$aPerspectives of GPU computing in Science$$cRome$$d2016-09-26 - 2016-09-28$$gGPU2016$$wItaly
000824109 245__ $$aAccelerating Plasma Physics with GPUs
000824109 260__ $$c2016
000824109 3367_ $$033$$2EndNote$$aConference Paper
000824109 3367_ $$2BibTeX$$aINPROCEEDINGS
000824109 3367_ $$2DRIVER$$aconferenceObject
000824109 3367_ $$2ORCID$$aCONFERENCE_POSTER
000824109 3367_ $$2DataCite$$aOutput Types/Conference Poster
000824109 3367_ $$0PUB:(DE-HGF)24$$2PUB:(DE-HGF)$$aPoster$$bposter$$mposter$$s1480074937_10113$$xAfter Call
000824109 502__ $$cSapienza Università di Roma
000824109 520__ $$aJuSPIC is a particle-in-cell (PIC) code developed in the Simulation Lab for Plasma Physics of the Jülich Supercomputing Centre. The open-source code is based on PSC by H. Ruhl, slimmed down and rewritten in modern Fortran. JuSPIC simulates particles under the influence of electromagnetic fields, using the relativistic Vlasov equation and Maxwell's equations (integrated with the finite-difference time-domain (FDTD) scheme). The program uses a regular mesh for the Maxwell fields and the particle charge/current densities; inside the mesh, quasi-particles with continuous coordinates are modeled via distribution functions. JuSPIC is a member of the High-Q Club, attesting that it scales efficiently to the full JUQUEEN supercomputer (currently #13 on the TOP500 list): 1.8 million threads running on 458 thousand cores collaboratively compute plasma simulations. Node-level parallelism is achieved with OpenMP; communication between nodes relies on MPI. JuSPIC is currently being extended to leverage the latest generation of supercomputers, which come equipped with dedicated accelerator technologies (GPUs and other many-core architectures). In this poster we present a GPU-accelerated version of the program that makes use of different programming models, and we show first results of performance studies comparing OpenACC and CUDA. While OpenACC aims to offer portability and flexibility with only a few changes to the code, the performance of the generated program can suffer in practice. To measure this deficit, the compute-intensive parts of the program are additionally implemented in CUDA Fortran. To explore the scalability properties of the application for static particle distributions on a heterogeneous architecture, we make use of semi-empirical performance models.
000824109 536__ $$0G:(DE-HGF)POF3-511$$a511 - Computational Science and Mathematical Methods (POF3-511)$$cPOF3-511$$fPOF III$$x0
000824109 7001_ $$0P:(DE-Juel1)144441$$aPleiter, Dirk$$b1$$ufzj
000824109 7001_ $$0P:(DE-Juel1)143606$$aBrömmel, Dirk$$b2$$ufzj
000824109 909CO $$ooai:juser.fz-juelich.de:824109$$pVDB
000824109 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)145478$$aForschungszentrum Jülich$$b0$$kFZJ
000824109 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)144441$$aForschungszentrum Jülich$$b1$$kFZJ
000824109 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)143606$$aForschungszentrum Jülich$$b2$$kFZJ
000824109 9131_ $$0G:(DE-HGF)POF3-511$$1G:(DE-HGF)POF3-510$$2G:(DE-HGF)POF3-500$$3G:(DE-HGF)POF3$$4G:(DE-HGF)POF$$aDE-HGF$$bKey Technologies$$lSupercomputing & Big Data$$vComputational Science and Mathematical Methods$$x0
000824109 9141_ $$y2016
000824109 915__ $$0StatID:(DE-HGF)0550$$2StatID$$aNo Authors Fulltext
000824109 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000824109 980__ $$aposter
000824109 980__ $$aVDB
000824109 980__ $$aUNRESTRICTED
000824109 980__ $$aI:(DE-Juel1)JSC-20090406
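
The abstract above contrasts directive-based OpenACC offloading with hand-written CUDA Fortran kernels. As a rough illustration of the first approach (a minimal sketch, not JuSPIC source: the array names, the Boris-style half kick, and the qdtm prefactor are all assumptions made for this example), a particle momentum update can be offloaded with a single OpenACC directive:

    program acc_push_sketch
      ! Minimal sketch (not JuSPIC source): offload a particle momentum
      ! update to the GPU with one OpenACC directive.
      implicit none
      integer, parameter :: n = 1000000
      real :: px(n), py(n), pz(n)   ! particle momenta (hypothetical layout)
      real :: ex(n), ey(n), ez(n)   ! E-field interpolated to the particles
      real, parameter :: qdtm = 0.5 ! assumed q*dt/(2m) prefactor
      integer :: i

      px = 0.0; py = 0.0; pz = 0.0
      ex = 1.0; ey = 0.0; ez = 0.0

      ! Half "electric kick" of a Boris-type push; the magnetic rotation
      ! and the position update are omitted for brevity.
      !$acc parallel loop copy(px, py, pz) copyin(ex, ey, ez)
      do i = 1, n
         px(i) = px(i) + qdtm * ex(i)
         py(i) = py(i) + qdtm * ey(i)
         pz(i) = pz(i) + qdtm * ez(i)
      end do

      print *, 'px(1) =', px(1)
    end program acc_push_sketch

Built with an OpenACC-capable compiler (e.g. pgfortran -acc in the poster's time frame), the directive is the only GPU-specific line. A CUDA Fortran version of the same loop would instead move the body into an attributes(global) kernel and manage device arrays explicitly; that extra effort, traded against the performance of the compiler-generated code, is what the poster's OpenACC-versus-CUDA comparison quantifies.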