001 | 280931 | ||
005 | 20210129221443.0 | ||
020 | _ | _ | |a 978-3-319-22996-6 |
020 | _ | _ | |a 978-3-319-22997-3 (electronic) |
024 | 7 | _ | |a 10.1007/978-3-319-22997-3_2 |2 doi |
037 | _ | _ | |a FZJ-2016-00642 |
041 | _ | _ | |a English |
082 | _ | _ | |a 510 |
100 | 1 | _ | |a Beckmann, Andreas |0 P:(DE-Juel1)157750 |b 0 |e Corresponding author |
111 | 2 | _ | |a 3rd International Workshop on Computational Engineering |g CE 2014 |c Stuttgart |d 2014-10-06 - 2014-10-10 |w Germany |
245 | _ | _ | |a Portable Node-Level Performance Optimization for the Fast Multipole Method |
260 | _ | _ | |a Cham |c 2015 |b Springer International Publishing |
295 | 1 | 0 | |a Recent Trends in Computational Engineering - CE2014 |
300 | _ | _ | |a 29 - 46 |
336 | 7 | _ | |a Contribution to a conference proceedings |b contrib |m contrib |0 PUB:(DE-HGF)8 |s 1453379840_2884 |2 PUB:(DE-HGF) |
336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
336 | 7 | _ | |a CONFERENCE_PAPER |2 ORCID |
336 | 7 | _ | |a Output Types/Conference Paper |2 DataCite |
336 | 7 | _ | |a conferenceObject |2 DRIVER |
336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
490 | 0 | _ | |a Lecture Notes in Computational Science and Engineering |v 105 |
520 | _ | _ | |a This article provides an in-depth analysis and high-level C++ optimization strategies for the most time-consuming kernels of a Fast Multipole Method (FMM). The two main kernels of a Coulomb FMM are formulated to support different hardware features, such as unrolling, vectorization or threading without the need to rewrite the kernels in intrinsics or even assembly. The abstract description of the algorithm automatically allows optimal node-level peak performance on a broad class of available hardware platforms. Most of the presented optimization schemes allow a generic, hence platform-independent description for other kernels as well. |
536 | _ | _ | |a 511 - Computational Science and Mathematical Methods (POF3-511) |0 G:(DE-HGF)POF3-511 |c POF3-511 |f POF III |x 0 |
536 | _ | _ | |0 G:(GEPRIS)230673686 |x 1 |c 230673686 |a GromEx - Highly Scalable Unified Long-Range Electrostatics and Flexible Ionization for Realistic Biomolecular Simulations on the Exascale (230673686) |
588 | _ | _ | |a Dataset connected to CrossRef Book Series |
700 | 1 | _ | |a Kabadshow, Ivo |0 P:(DE-Juel1)132152 |b 1 |
773 | _ | _ | |a 10.1007/978-3-319-22997-3_2 |
909 | C | O | |o oai:juser.fz-juelich.de:280931 |p VDB |
910 | 1 | _ | |a Forschungszentrum Jülich GmbH |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)157750 |
910 | 1 | _ | |a Forschungszentrum Jülich GmbH |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)132152 |
913 | 1 | _ | |a DE-HGF |b Key Technologies |1 G:(DE-HGF)POF3-510 |0 G:(DE-HGF)POF3-511 |2 G:(DE-HGF)POF3-500 |v Computational Science and Mathematical Methods |x 0 |4 G:(DE-HGF)POF |3 G:(DE-HGF)POF3 |l Supercomputing & Big Data |
914 | 1 | _ | |y 2015 |
915 | _ | _ | |a No Authors Fulltext |0 StatID:(DE-HGF)0550 |2 StatID |
920 | _ | _ | |l yes |
920 | 1 | _ | |0 I:(DE-Juel1)JSC-20090406 |k JSC |l Jülich Supercomputing Center |x 0 |
980 | _ | _ | |a contrib |
980 | _ | _ | |a VDB |
980 | _ | _ | |a UNRESTRICTED |
980 | _ | _ | |a I:(DE-Juel1)JSC-20090406 |