001     1044949
005     20251030202115.0
024 7 _ |a 10.1007/s11227-025-07145-6
|2 doi
024 7 _ |a 0920-8542
|2 ISSN
024 7 _ |a 1573-0484
|2 ISSN
024 7 _ |a 10.34734/FZJ-2025-03449
|2 datacite_doi
037 _ _ |a FZJ-2025-03449
082 _ _ |a 620
100 1 _ |a Baumeister, Paul F.
|0 P:(DE-Juel1)156619
|b 0
|e Corresponding author
245 _ _ |a tfQMRgpu: a GPU-accelerated linear solver with block-sparse complex result matrix
260 _ _ |a Dordrecht [u.a.]
|c 2025
|b Springer Science + Business Media B.V.
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1761835131_12063
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a We present tfQMRgpu, a GPU-accelerated iterative linear solver based on the transpose-free quasi-minimal residual (tfQMR) method. Designed for large-scale electronic structure calculations, particularly in the context of Korringa–Kohn–Rostoker density functional theory, tfQMRgpu efficiently handles block-sparse complex matrices arising from multiple scattering theory. The solver exploits GPU parallelism to accelerate convergence while leveraging memory-efficient sparse storage formats. By unifying the solution of multiple right-hand side (RHS) block vectors, tfQMRgpu significantly improves throughput, demonstrating up to a speedup on modern GPUs. Additionally, we introduce a flexible implementation framework that supports both explicit matrix-based and matrix-free operator formulations, such as high-order finite-difference stencils for real-space grid-based Green function calculations. Benchmarks on various NVIDIA GPUs demonstrate the solver’s efficiency, in some cases achieving over 56% of peak floating-point performance for block-sparse matrix multiplications. tfQMRgpu is open-source, providing interfaces for C, C++, Fortran, Julia, and Python, making it a versatile tool for high-performance computing applications that can benefit from the unification of RHS problems.
536 _ _ |a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5111
|c POF4-511
|f POF IV
|x 0
536 _ _ |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5112
|c POF4-511
|f POF IV
|x 1
536 _ _ |a 5122 - Future Computing & Big Data Systems (POF4-512)
|0 G:(DE-HGF)POF4-5122
|c POF4-512
|f POF IV
|x 2
536 _ _ |a BMBF 01 1H1 6013, NRW 325 – 8.03 – 133340 - SiVeGCS (DB001492)
|0 G:(DE-Juel-1)DB001492
|c DB001492
|x 3
536 _ _ |a ATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)
|0 G:(DE-Juel-1)ATML-X-DEV
|c ATML-X-DEV
|x 4
588 _ _ |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de
700 1 _ |a Nassyr, Stepan
|0 P:(DE-Juel1)172888
|b 1
773 _ _ |a 10.1007/s11227-025-07145-6
|g Vol. 81, no. 5, p. 663
|0 PERI:(DE-600)1479917-0
|n 5
|p 663
|t The journal of supercomputing
|v 81
|y 2025
|x 0920-8542
856 4 _ |u https://juser.fz-juelich.de/record/1044949/files/s11227-025-07145-6.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1044949
|p openaire
|p open_access
|p OpenAPC_DEAL
|p driver
|p VDB
|p openCost
|p dnbdelivery
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)156619
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)172888
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5111
|x 0
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5112
|x 1
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-512
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Supercomputing & Big Data Infrastructures
|9 G:(DE-HGF)POF4-5122
|x 2
914 1 _ |y 2025
915 p c |a APC keys set
|0 PC:(DE-HGF)0000
|2 APC
915 p c |a DEAL: Springer Nature 2020
|0 PC:(DE-HGF)0113
|2 APC
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0200
|2 StatID
|b SCOPUS
|d 2024-12-18
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0160
|2 StatID
|b Essential Science Indicators
|d 2024-12-18
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)1160
|2 StatID
|b Current Contents - Engineering, Computing and Technology
|d 2024-12-18
915 _ _ |a Creative Commons Attribution CC BY 4.0
|0 LIC:(DE-HGF)CCBY4
|2 HGFVOC
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0600
|2 StatID
|b Ebsco Academic Search
|d 2024-12-18
915 _ _ |a JCR
|0 StatID:(DE-HGF)0100
|2 StatID
|b J SUPERCOMPUT : 2022
|d 2024-12-18
915 _ _ |a WoS
|0 StatID:(DE-HGF)0113
|2 StatID
|b Science Citation Index Expanded
|d 2024-12-18
915 _ _ |a DEAL Springer
|0 StatID:(DE-HGF)3002
|2 StatID
|d 2024-12-18
|w ger
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0150
|2 StatID
|b Web of Science Core Collection
|d 2024-12-18
915 _ _ |a IF < 5
|0 StatID:(DE-HGF)9900
|2 StatID
|d 2024-12-18
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
915 _ _ |a Peer Review
|0 StatID:(DE-HGF)0030
|2 StatID
|b ASC
|d 2024-12-18
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0300
|2 StatID
|b Medline
|d 2024-12-18
915 _ _ |a Nationallizenz
|0 StatID:(DE-HGF)0420
|2 StatID
|d 2024-12-18
|w ger
915 _ _ |a DBCoverage
|0 StatID:(DE-HGF)0199
|2 StatID
|b Clarivate Analytics Master Journal List
|d 2024-12-18
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a APC
980 1 _ |a APC
980 1 _ |a FullTexts

