001     281229
005     20210129221650.0
024 7 _ |a 10.1002/cpe.3552
|2 doi
024 7 _ |a 1040-3108
|2 ISSN
024 7 _ |a 1096-9128
|2 ISSN
024 7 _ |a 1532-0626
|2 ISSN
024 7 _ |a 1532-0634
|2 ISSN
024 7 _ |a WOS:000376263300002
|2 WOS
037 _ _ |a FZJ-2016-00928
082 _ _ |a 004
100 1 _ |a Alvarez Mallon, Damian
|0 P:(DE-Juel1)144660
|b 0
|e Corresponding author
|u fzj
245 _ _ |a MPI and UPC broadcast, scatter and gather algorithms in Xeon Phi
260 _ _ |a Chichester
|c 2016
|b Wiley
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1462946632_5389
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a Accelerators have revolutionised the high performance computing (HPC) community. Despite their advantages, their very specific programming models and limited communication capabilities have kept them in a supporting role to the main processors. With the introduction of Xeon Phi, this is no longer true, as it can be programmed as the main processor and has direct access to the InfiniBand network adapter. Collective operations play a key role in many HPC applications. Therefore, studying their behaviour in the context of manycore coprocessors is of great importance. This work analyses the performance of different algorithms for broadcast, scatter and gather in a large-scale Xeon Phi supercomputer. The algorithms evaluated are those available in the reference message passing interface (MPI) implementation for Xeon Phi (Intel MPI), the default algorithm in an optimised MPI implementation (MVAPICH2-MIC), and a new set of algorithms, developed by the authors of this work, designed with modern processors and new communication features in mind. The latter are implemented in Unified Parallel C (UPC), a partitioned global address space language, leveraging one-sided communications, hierarchical trees and message pipelining. This study scales the experiments to 15360 cores in the Stampede supercomputer and compares the results to Xeon and hybrid Xeon + Xeon Phi experiments, with up to 19456 cores.
536 _ _ |a 513 - Supercomputer Facility (POF3-513)
|0 G:(DE-HGF)POF3-513
|c POF3-513
|f POF III
|x 0
588 _ _ |a Dataset connected to CrossRef
700 1 _ |a Taboada, Guillermo L.
|0 P:(DE-HGF)0
|b 1
700 1 _ |a Koesterke, Lars
|0 P:(DE-HGF)0
|b 2
770 _ _ |a Special Issue on Heterogeneous and Unconventional Cluster Architectures and Applications
773 _ _ |a 10.1002/cpe.3552
|g p. n/a - n/a
|0 PERI:(DE-600)2052606-4
|n 8
|p 2322–2340
|t Concurrency and computation
|v 28
|y 2016
|x 1532-0626
909 C O |o oai:juser.fz-juelich.de:281229
|p VDB
910 1 _ |a Forschungszentrum Jülich GmbH
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)144660
910 1 _ |a External Institute
|0 I:(DE-HGF)0
|k Extern
|b 1
|6 P:(DE-HGF)0
910 1 _ |a External Institute
|0 I:(DE-HGF)0
|k Extern
|b 2
|6 P:(DE-HGF)0
913 1 _ |a DE-HGF
|b Key Technologies
|1 G:(DE-HGF)POF3-510
|0 G:(DE-HGF)POF3-513
|2 G:(DE-HGF)POF3-500
|v Supercomputer Facility
|x 0
|4 G:(DE-HGF)POF
|3 G:(DE-HGF)POF3
|l Supercomputing & Big Data
914 1 _ |y 2016
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1160
|2 StatID
|b Current Contents - Engineering, Computing and Technology
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b CONCURR COMP-PRACT E : 2014
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
915 _ _ |a WoS
|0 StatID:(DE-HGF)0111
|2 StatID
|b Science Citation Index Expanded
915 _ _ |a IF < 5
|0 StatID:(DE-HGF)9900
|2 StatID
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
915 _ _ |a No Authors Fulltext
|0 StatID:(DE-HGF)0550
|2 StatID
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Thomson Reuters Master Journal List
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406

