001044949 001__ 1044949
001044949 005__ 20251030202115.0
001044949 0247_ $$2doi$$a10.1007/s11227-025-07145-6
001044949 0247_ $$2ISSN$$a0920-8542
001044949 0247_ $$2ISSN$$a1573-0484
001044949 0247_ $$2datacite_doi$$a10.34734/FZJ-2025-03449
001044949 037__ $$aFZJ-2025-03449
001044949 082__ $$a620
001044949 1001_ $$0P:(DE-Juel1)156619$$aBaumeister, Paul F.$$b0$$eCorresponding author
001044949 245__ $$atfQMRgpu: a GPU-accelerated linear solver with block-sparse complex result matrix
001044949 260__ $$aDordrecht [u.a.]$$bSpringer Science + Business Media B.V$$c2025
001044949 3367_ $$2DRIVER$$aarticle
001044949 3367_ $$2DataCite$$aOutput Types/Journal article
001044949 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1761835131_12063
001044949 3367_ $$2BibTeX$$aARTICLE
001044949 3367_ $$2ORCID$$aJOURNAL_ARTICLE
001044949 3367_ $$00$$2EndNote$$aJournal Article
001044949 520__ $$aWe present tfQMRgpu, a GPU-accelerated iterative linear solver based on the transpose-free quasi-minimal residual (tfQMR) method. Designed for large-scale electronic structure calculations, particularly in the context of Korringa–Kohn–Rostoker density functional theory, tfQMRgpu efficiently handles block-sparse complex matrices arising from multiple scattering theory. The solver exploits GPU parallelism to accelerate convergence while leveraging memory-efficient sparse storage formats. By unifying the solution of multiple right-hand side (RHS) block vectors, tfQMRgpu significantly improves throughput, demonstrating a speedup on modern GPUs. Additionally, we introduce a flexible implementation framework that supports both explicit matrix-based and matrix-free operator formulations, such as high-order finite-difference stencils for real-space grid-based Green function calculations. Benchmarks on various NVIDIA GPUs demonstrate the solver’s efficiency, in some cases achieving over 56% of peak floating-point performance for block-sparse matrix multiplications. tfQMRgpu is open-source, providing interfaces for C, C++, Fortran, Julia, and Python, making it a versatile tool for high-performance computing applications that can benefit from the unification of RHS problems.
001044949 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001044949 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x1
001044949 536__ $$0G:(DE-HGF)POF4-5122$$a5122 - Future Computing & Big Data Systems (POF4-512)$$cPOF4-512$$fPOF IV$$x2
001044949 536__ $$0G:(DE-Juel-1)DB001492$$aBMBF 01 1H1 6013, NRW 325 – 8.03 – 133340 - SiVeGCS (DB001492)$$cDB001492$$x3
001044949 536__ $$0G:(DE-Juel-1)ATML-X-DEV$$aATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)$$cATML-X-DEV$$x4
001044949 588__ $$aDataset connected to CrossRef, Journals: juser.fz-juelich.de
001044949 7001_ $$0P:(DE-Juel1)172888$$aNassyr, Stepan$$b1
001044949 773__ $$0PERI:(DE-600)1479917-0$$a10.1007/s11227-025-07145-6$$gVol. 81, no. 5, p. 663$$n5$$p663$$tThe journal of supercomputing$$v81$$x0920-8542$$y2025
001044949 8564_ $$uhttps://juser.fz-juelich.de/record/1044949/files/s11227-025-07145-6.pdf$$yOpenAccess
001044949 8767_ $$d2025-08-11$$eHybrid-OA$$jDEAL
001044949 909CO $$ooai:juser.fz-juelich.de:1044949$$popenaire$$popen_access$$pOpenAPC_DEAL$$pdriver$$pVDB$$popenCost$$pdnbdelivery
001044949 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)156619$$aForschungszentrum Jülich$$b0$$kFZJ
001044949 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)172888$$aForschungszentrum Jülich$$b1$$kFZJ
001044949 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001044949 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x1
001044949 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5122$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x2
001044949 9141_ $$y2025
001044949 915pc $$0PC:(DE-HGF)0000$$2APC$$aAPC keys set
001044949 915pc $$0PC:(DE-HGF)0113$$2APC$$aDEAL: Springer Nature 2020
001044949 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)1160$$2StatID$$aDBCoverage$$bCurrent Contents - Engineering, Computing and Technology$$d2024-12-18
001044949 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0
001044949 915__ $$0StatID:(DE-HGF)0600$$2StatID$$aDBCoverage$$bEbsco Academic Search$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bJ SUPERCOMPUT : 2022$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)3002$$2StatID$$aDEAL Springer$$d2024-12-18$$wger
001044949 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)9900$$2StatID$$aIF < 5$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001044949 915__ $$0StatID:(DE-HGF)0030$$2StatID$$aPeer Review$$bASC$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2024-12-18
001044949 915__ $$0StatID:(DE-HGF)0420$$2StatID$$aNationallizenz$$d2024-12-18$$wger
001044949 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2024-12-18
001044949 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001044949 980__ $$ajournal
001044949 980__ $$aVDB
001044949 980__ $$aUNRESTRICTED
001044949 980__ $$aI:(DE-Juel1)JSC-20090406
001044949 980__ $$aAPC
001044949 9801_ $$aAPC
001044949 9801_ $$aFullTexts