Portable Node-Level Performance Optimization for the Fast Multipole Method
Contribution to a conference proceedings | FZJ-2016-00642
2015
Springer International Publishing
Cham
ISBN: 978-3-319-22996-6, 978-3-319-22997-3 (electronic)
Please use a persistent id in citations: doi:10.1007/978-3-319-22997-3_2
Abstract: This article provides an in-depth analysis and high-level C++ optimization strategies for the most time-consuming kernels of a Fast Multipole Method (FMM). The two main kernels of a Coulomb FMM are formulated to support different hardware features, such as unrolling, vectorization or threading without the need to rewrite the kernels in intrinsics or even assembly. The abstract description of the algorithm automatically allows optimal node-level peak performance on a broad class of available hardware platforms. Most of the presented optimization schemes allow a generic, hence platform-independent description for other kernels as well.