001     150587
005     20210129213227.0
024 7 _ |a 10.1109/HPCC.2012.88
|2 doi
024 7 _ |a WOS:000310377500079
|2 WOS
037 _ _ |a FZJ-2014-00636
100 1 _ |a Dachsel, Holger
|0 P:(DE-Juel1)132079
|b 0
|u fzj
|e Corresponding author
111 2 _ |a 2012 IEEE 14th Int'l Conf. on High Performance Computing and Communication (HPCC) & 2012 IEEE 9th Int'l Conf. on Embedded Software and Systems (ICESS)
|c Liverpool
|d 2012-06-25 - 2012-06-27
|w United Kingdom
245 _ _ |a Automatic Tuning of the Fast Multipole Method Based on Integrated Performance Prediction
260 _ _ |c 2012
|b IEEE
295 1 0 |a 2012 IEEE 14th International Conference on High Performance Computing and Communication & 2012 IEEE 9th International Conference on Embedded Software and Systems
300 _ _ |a 617-624
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1391441314_8059
|2 PUB:(DE-HGF)
336 7 _ |a Contribution to a book
|0 PUB:(DE-HGF)7
|2 PUB:(DE-HGF)
|m contb
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a INPROCEEDINGS
|2 BibTeX
520 _ _ |a The Fast Multipole Method (FMM) is an efficient, widely used method for the solution of N-body problems. One of the main data structures is a hierarchical tree data structure describing the separation into near-field and far-field particle interactions. This article presents a method for automatic tuning of the FMM by selecting the optimal FMM tree depth based on an integrated performance prediction of the FMM computations. The prediction method exploits benchmarking of significant parts of the FMM implementation to adapt the tuning to the specific hardware system being used. Furthermore, a separate analysis phase at runtime is used to predict the computational load caused by the specific particle system to be computed. The tuning method was integrated into an FMM implementation. Performance results show that a reliable determination of the tree depth is achieved, thus leading to minimal execution times of the FMM algorithm.
536 _ _ |a 411 - Computational Science and Mathematical Methods (POF2-411)
|0 G:(DE-HGF)POF2-411
|c POF2-411
|x 0
|f POF II
588 _ _ |a Dataset connected to CrossRef Conference
700 1 _ |a Hofmann, Michael
|0 P:(DE-HGF)0
|b 1
700 1 _ |a Lang, Jens
|0 P:(DE-HGF)0
|b 2
700 1 _ |a Rünger, Gudula
|0 P:(DE-HGF)0
|b 3
773 _ _ |a 10.1109/HPCC.2012.88
909 C O |o oai:juser.fz-juelich.de:150587
|p VDB
910 1 _ |a Forschungszentrum Jülich GmbH
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)132079
913 2 _ |a DE-HGF
|b Key Technologies
|l Supercomputing & Big Data
|1 G:(DE-HGF)POF3-510
|0 G:(DE-HGF)POF3-511
|2 G:(DE-HGF)POF3-500
|v Computational Science and Mathematical Methods
|x 0
913 1 _ |a DE-HGF
|b Schlüsseltechnologien
|l Supercomputing
|1 G:(DE-HGF)POF2-410
|0 G:(DE-HGF)POF2-411
|2 G:(DE-HGF)POF2-400
|v Computational Science and Mathematical Methods
|x 0
|4 G:(DE-HGF)POF
|3 G:(DE-HGF)POF2
914 1 _ |y 2013
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a contb
980 _ _ |a I:(DE-Juel1)JSC-20090406

