Journal Article FZJ-2015-01627

Performance of a parallel matrix multiplication routine on Intel iPSC/860

1994
North-Holland, Elsevier Science Amsterdam [u.a.]

Parallel Computing 20(7), 953-974 (1994) [doi:10.1016/0167-8191(94)90012-4]


Abstract: The performance of a parallel matrix-matrix multiplication routine with the same functionality as DGEMM of BLAS3 was tested for different numbers of nodes on a 32-node iPSC/860. The routine was then tuned for maximum performance on this particular computer system. Small changes in the original code lead to substantially higher performance, and in all tested configurations there is a critical matrix size n ≈ 50·np, where np is the number of processors, above which Intel's non-blocking isend is more efficient than the blocking csend. This shows that special tuning for a single machine pays off for large matrices.
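
Illustrative sketch (not taken from the article): the DGEMM operation mentioned in the abstract computes C <- alpha*A*B + beta*C. The record does not say how the routine distributes data or exchanges messages, so the row-block split below is only an assumption, the "nodes" are simulated one after another without any message passing, and every function name is hypothetical. The last helper merely encodes the abstract's rule of thumb n ≈ 50·np as a rough threshold.

```python
def row_blocks(n_rows, num_nodes):
    """Split row indices 0..n_rows-1 into num_nodes contiguous blocks
    (assumed distribution; the record does not state the real one)."""
    base, extra = divmod(n_rows, num_nodes)
    blocks, start = [], 0
    for rank in range(num_nodes):
        size = base + (1 if rank < extra else 0)
        blocks.append(range(start, start + size))
        start += size
    return blocks


def parallel_dgemm_sketch(alpha, a, b, beta, c, num_nodes):
    """Return alpha*A*B + beta*C, computing one row block per simulated node.
    Matrices are lists of rows; transposed operands are not handled."""
    inner, n_cols = len(b), len(b[0])
    result = [None] * len(a)
    for block in row_blocks(len(a), num_nodes):
        # On a real machine each block would be owned by a different node.
        for i in block:
            result[i] = [
                alpha * sum(a[i][k] * b[k][j] for k in range(inner))
                + beta * c[i][j]
                for j in range(n_cols)
            ]
    return result


def critical_size(num_nodes, factor=50):
    """Approximate critical matrix size n = factor * np from the abstract."""
    return factor * num_nodes


def suggested_send(n, num_nodes):
    """Name of the send call the abstract reports as more efficient:
    non-blocking isend above the critical size, blocking csend otherwise."""
    return "isend" if n > critical_size(num_nodes) else "csend"
```

For the full 32-node machine this rule of thumb puts the crossover near n = 50·32 = 1600; the abstract gives the value only approximately, so the hard threshold here is a simplification.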

Contributing Institute(s):
  1. Zentralinstitut für Angewandte Mathematik (ZAM)
  2. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 899 - without topic (POF2-899)

Database coverage:
Medline ; OpenAccess ; Current Contents - Engineering, Computing and Technology ; IF < 5 ; JCR ; SCOPUS ; Science Citation Index Expanded ; Thomson Reuters Master Journal List ; Web of Science Core Collection

The record appears in these collections:
Document types > Articles > Journal articles
Workflow collections > Public records
Institute collections > JSC
Publications database
Open Access

 Record created 2015-02-26, last modified 2021-01-29

