001     907602
005     20221109161719.0
024 7 _ |a arXiv:2205.02491
|2 arXiv
024 7 _ |a 10.1145/3539781.3539792
|2 doi
024 7 _ |a 2128/31597
|2 Handle
037 _ _ |a FZJ-2022-02101
100 1 _ |a Wu, Xinzhe
|0 P:(DE-Juel1)178969
|b 0
|u fzj
111 2 _ |a Platform for Advanced Scientific Computing
|g PASC22
|c Basel
|d 2022-06-27 - 2022-06-29
|w Switzerland
245 _ _ |a ChASE - A Distributed Hybrid CPU-GPU Eigensolver for Large-scale Hermitian Eigenvalue Problems
260 _ _ |c 2022
|b ACM New York, NY, USA
295 1 0 |a Proceedings of the Platform for Advanced Scientific Computing Conference - ACM New York, NY, USA, 2022. - ISBN 9781450394109 - doi:10.1145/3539781.3539792
300 _ _ |a 12 pages
336 7 _ |a CONFERENCE_PAPER
|2 ORCID
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Paper
|2 DataCite
336 7 _ |a Contribution to a conference proceedings
|b contrib
|m contrib
|0 PUB:(DE-HGF)8
|s 1659093693_31246
|2 PUB:(DE-HGF)
336 7 _ |a Contribution to a book
|0 PUB:(DE-HGF)7
|2 PUB:(DE-HGF)
|m contb
520 _ _ |a As modern massively parallel clusters grow larger with more powerful compute nodes, traditional parallel eigensolvers, such as direct solvers, struggle to keep pace with hardware evolution and to scale efficiently because of additional layers of communication and synchronization. This difficulty is especially significant when porting traditional libraries to heterogeneous computing architectures equipped with accelerators, such as Graphics Processing Units (GPUs). Recently, there have been significant scientific contributions to the development of filter-based subspace eigensolvers for computing partial eigenspectra. The simpler structure of this type of algorithm makes it easier to avoid the communication and synchronization bottlenecks typical of direct solvers. The Chebyshev Accelerated Subspace Eigensolver (ChASE) is a modern subspace eigensolver that computes partial extremal eigenpairs of large-scale Hermitian eigenproblems, accelerated by a filter based on Chebyshev polynomials. In this work, we extend our previous work on ChASE by adding support for distributed hybrid CPU-multi-GPU computing architectures. Our tests show that ChASE achieves very good scaling performance up to 144 nodes with 526 NVIDIA A100 GPUs in total on dense eigenproblems of size up to 360k.
536 _ _ |a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5111
|c POF4-511
|f POF IV
|x 0
536 _ _ |a PRACE-6IP - PRACE 6th Implementation Phase Project (823767)
|0 G:(EU-Grant)823767
|c 823767
|f H2020-INFRAEDI-2018-1
|x 1
536 _ _ |a Simulation and Data Laboratory Quantum Materials (SDLQM) (SDLQM)
|0 G:(DE-Juel1)SDLQM
|c SDLQM
|f Simulation and Data Laboratory Quantum Materials (SDLQM)
|x 2
588 _ _ |a Dataset connected to DataCite
700 1 _ |a Davidovic, Davor
|0 P:(DE-HGF)0
|b 1
700 1 _ |a Achilles, Sebastian
|0 P:(DE-Juel1)169552
|b 2
|u fzj
700 1 _ |a Di Napoli, Edoardo
|0 P:(DE-Juel1)144723
|b 3
|e Corresponding author
|u fzj
773 _ _ |a 10.1145/3539781.3539792
|p Article No.: 9
856 4 _ |u https://juser.fz-juelich.de/record/907602/files/2205.02491.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:907602
|p openaire
|p open_access
|p driver
|p VDB
|p ec_fundedresources
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)178969
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)169552
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 3
|6 P:(DE-Juel1)144723
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5111
|x 0
914 1 _ |y 2022
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a contrib
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a contb
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

