| tfQMRgpu: a GPU-accelerated linear solver with block-sparse complex result matrix |
| Journal Article | FZJ-2025-03449 |
2025
Springer Science + Business Media B.V.
Dordrecht [u.a.]
Please use a persistent id in citations: doi:10.1007/s11227-025-07145-6 doi:10.34734/FZJ-2025-03449
Abstract: We present tfQMRgpu, a GPU-accelerated iterative linear solver based on the transpose-free quasi-minimal residual (tfQMR) method. Designed for large-scale electronic structure calculations, particularly in the context of Korringa–Kohn–Rostoker density functional theory, tfQMRgpu efficiently handles block-sparse complex matrices arising from multiple scattering theory. The solver exploits GPU parallelism to accelerate convergence while leveraging memory-efficient sparse storage formats. By unifying the solution of multiple right-hand side (RHS) block vectors, tfQMRgpu significantly improves throughput, demonstrating up to a speedup on modern GPUs. Additionally, we introduce a flexible implementation framework that supports both explicit matrix-based and matrix-free operator formulations, such as high-order finite-difference stencils for real-space grid-based Green function calculations. Benchmarks on various NVIDIA GPUs demonstrate the solver’s efficiency, in some cases achieving over 56% of peak floating-point performance for block-sparse matrix multiplications. tfQMRgpu is open-source, providing interfaces for C, C++, Fortran, Julia, and Python, making it a versatile tool for high-performance computing applications that can benefit from the unification of RHS problems.
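To illustrate the kind of kernel the abstract refers to, the following is a minimal sketch of a block-sparse complex matrix applied to several right-hand-side columns at once. It is not the tfQMRgpu interface: the function name `bsr_matmul`, the arrays `row_ptr`, `col_idx` and `blocks`, and the block-compressed-row layout are assumptions chosen for illustration. It runs in plain Python on the CPU to stay dependency-free.

```python
def bsr_matmul(row_ptr, col_idx, blocks, x_blocks):
    """Return Y = A @ X for a block-sparse A and a block vector X.

    Hypothetical layout (an assumption, not taken from tfQMRgpu):
      row_ptr[i]..row_ptr[i+1] index the stored blocks of block row i,
      col_idx[k] is the block column of stored block k,
      blocks[k] is a b x b complex block,
      x_blocks[j] is a b x n_rhs block holding all right-hand sides.
    """
    n_block_rows = len(row_ptr) - 1
    b = len(blocks[0])            # edge length of the square blocks
    n_rhs = len(x_blocks[0][0])   # number of right-hand-side columns
    y_blocks = [[[0j] * n_rhs for _ in range(b)] for _ in range(n_block_rows)]
    for i in range(n_block_rows):
        # Only the stored (non-zero) blocks of block row i are visited.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a, x, y = blocks[k], x_blocks[col_idx[k]], y_blocks[i]
            for r in range(b):
                for c in range(b):
                    a_rc = a[r][c]
                    # One matrix element is reused for every RHS column.
                    for j in range(n_rhs):
                        y[r][j] += a_rc * x[c][j]
    return y_blocks


# 2 x 2 block matrix with 2 x 2 complex blocks; block (1, 0) is not stored.
row_ptr = [0, 2, 3]
col_idx = [0, 1, 1]
blocks = [
    [[1 + 1j, 0j], [0j, 1 - 1j]],   # A[0, 0]
    [[0j, 2j], [2j, 0j]],           # A[0, 1]
    [[3 + 0j, 0j], [0j, 3 + 0j]],   # A[1, 1]
]
x_blocks = [
    [[1 + 0j, 0j], [0j, 1 + 0j]],   # X[0], two RHS columns
    [[1j, 1 + 0j], [1 + 0j, -1j]],  # X[1]
]
y_blocks = bsr_matmul(row_ptr, col_idx, blocks, x_blocks)
```

Each stored matrix element is loaded once and reused for every right-hand-side column. This reuse is one plausible reason why treating many right-hand sides together raises throughput, as the abstract reports.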