Performance of a parallel matrix multiplication routine on Intel iPSC/860

Gutheil, Inge; Krotz-Vogel, Werner

doi:10.1016/0167-8191(94)90012-4

%0 Journal Article
%A Gutheil, Inge
%A Krotz-Vogel, Werner
%T Performance of a parallel matrix multiplication routine on Intel iPSC/860
%J Parallel computing
%V 20
%N 7
%@ 0167-8191
%C Amsterdam [u.a.]
%I North-Holland, Elsevier Science
%M FZJ-2015-01627
%P 953 - 974
%D 1994
%X The performance of a parallel matrix-matrix multiplication routine with the same functionality as DGEMM of BLAS3 was tested for different numbers of nodes on a 32-node iPSC/860. The routine was then tuned for maximum performance on this particular computer system. Small changes in the original code lead to substantially higher performance, and in all tested configurations there is a critical matrix size n≈50·np, where np is the number of processors, above which Intel's non-blocking isend is more efficient than the blocking csend. This shows that special tuning for a single machine pays off for large matrices.
%F PUB:(DE-HGF)16
%9 Journal Article
%U <Go to ISI:>//WOS:A1994NY49800002
%R 10.1016/0167-8191(94)90012-4
%U https://juser.fz-juelich.de/record/188166
