| 001 | 1049541 | ||
| 005 | 20251213202221.0 | ||
| 037 | _ | _ | |a FZJ-2025-05345 |
| 100 | 1 | _ | |a Lengvenis, Arijus |0 P:(DE-Juel1)206763 |b 0 |e Corresponding author |u fzj |
| 111 | 2 | _ | |a 2025 IEEE International Parallel and Distributed Processing Symposium |g IPDPS |c Milano |d 2025-06-03 - 2025-06-07 |w Italy |
| 245 | _ | _ | |a A New Spin on the Fast Multipole Method for GPUs: Rethinking the Far-Field Operators |
| 260 | _ | _ | |c 2025 |
| 336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
| 336 | 7 | _ | |a Other |2 DataCite |
| 336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
| 336 | 7 | _ | |a conferenceObject |2 DRIVER |
| 336 | 7 | _ | |a LECTURE_SPEECH |2 ORCID |
| 336 | 7 | _ | |a Conference Presentation |b conf |m conf |0 PUB:(DE-HGF)6 |s 1765628766_19099 |2 PUB:(DE-HGF) |x After Call |
| 520 | _ | _ | |a The Fast Multipole Method (FMM) is an optimally efficient algorithm for solving N-body problems, a fundamental challenge in fields like astrophysics, plasma physics and molecular dynamics. It is particularly suited for computing 1/r potentials present in Coulomb and gravitational particle systems. Despite the near-field phase being trivially parallelisable, the far-field phase of the 1/r FMM currently lacks an efficient, massively parallel GPU algorithm fit for the era of Exascale computing. Current state-of-the-art approaches either favor highly parallel but inefficient expansion shift operators or asymptotically efficient but poorly parallelisable rotation-based ones. Recently, a breakthrough was made with the re-evaluation of a rotation operator variant called fast rotation, which dramatically increases caching effectiveness and marries the advantages of both methods. Thus, this paper incorporates this approach to create fast rotation-based operators that facilitate an efficient far-field algorithm for the FMM on GPUs. Additionally, a warp-centric data access scheme is co-developed alongside a matching octree design, which yields coalesced memory access patterns for the bottleneck operators of the far-field phase. The fast rotation algorithm is enhanced with a cache-tiling mechanism, maximising GPU cache utilisation. Compared to the state-of-the-art GPU FMM far-field implementation, our algorithm achieves lower running times across the board and a 2.47x speedup for an increased-precision simulation, with the performance improvement growing as precision increases, providing concrete proof of efficacy for dense particle systems. |
| 536 | _ | _ | |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511) |0 G:(DE-HGF)POF4-5112 |c POF4-511 |f POF IV |x 0 |
| 700 | 1 | _ | |a Dachsel, Holger |0 P:(DE-Juel1)132079 |b 1 |u fzj |
| 700 | 1 | _ | |a Morgenstern, Laura |0 P:(DE-Juel1)169856 |b 2 |
| 700 | 1 | _ | |a Kabadshow, Ivo |0 P:(DE-Juel1)132152 |b 3 |u fzj |
| 909 | C | O | |o oai:juser.fz-juelich.de:1049541 |p VDB |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)206763 |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)132079 |
| 910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 3 |6 P:(DE-Juel1)132152 |
| 913 | 1 | _ | |a DE-HGF |b Key Technologies |l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action |1 G:(DE-HGF)POF4-510 |0 G:(DE-HGF)POF4-511 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Enabling Computational- & Data-Intensive Science and Engineering |9 G:(DE-HGF)POF4-5112 |x 0 |
| 914 | 1 | _ | |y 2025 |
| 920 | _ | _ | |l yes |
| 920 | 1 | _ | |0 I:(DE-Juel1)JSC-20090406 |k JSC |l Jülich Supercomputing Center |x 0 |
| 980 | _ | _ | |a conf |
| 980 | _ | _ | |a VDB |
| 980 | _ | _ | |a I:(DE-Juel1)JSC-20090406 |
| 980 | _ | _ | |a UNRESTRICTED |