TY  - CONF
AU  - Lippert, Th.
AU  - Petkov, N.
AU  - Schilling, K.
A3  - Hertzberger, Bob
A3  - Sloot, Peter
TI  - BLAS-3 for the Quadrics parallel computer
VL  - 1225
CY  - Berlin, Heidelberg
PB  - Springer Berlin Heidelberg
M1  - FZJ-2019-01120
SN  - 978-3-540-62898-9 (print)
T2  - Lecture Notes in Computer Science
SP  - 332
EP  - 341
PY  - 1997
AB  - A scalable parallel algorithm for matrix multiplication on SISAMD computers is presented. Our method enables us to implement an efficient BLAS library on the Italian APE100/Quadrics SISAMD massively parallel computer on which hitherto scalable parallel BLAS-3 were not available. The approach proposed is based on a one-dimensional ring connectivity. The flow of data is hyper-systolic. The communication overhead is competitive with that of established algorithms for SIMD and MIMD machines. Advantages are that (i) the layout of the matrices is preserved during the computation, (ii) BLAS-2 fit well into this layout and (iii) indexed addressing is avoided, which renders the algorithm suitable for SISAMD machines and, in this way, for all other types of parallel computers. On the APE100/Quadrics, a performance of nearly 25 % of the peak performance for multiplications of complex matrices is achieved.
T2  - International Conference on High-Performance Computing and Networking
CY  - 28 Apr 1997 - 30 Apr 1997, Vienna (Austria)
Y2  - 28 Apr 1997 - 30 Apr 1997
M2  - Vienna, Austria
LB  - PUB:(DE-HGF)8 ; PUB:(DE-HGF)7
DO  - 10.1007/BFb0031605
UR  - https://juser.fz-juelich.de/record/860345
ER  -