001 | 1019542 | ||
005 | 20240109115103.0 | ||
024 | 7 | _ | |a 10.1016/j.cpc.2022.108555 |2 doi |
024 | 7 | _ | |a 0010-4655 |2 ISSN |
024 | 7 | _ | |a 1386-9485 |2 ISSN |
024 | 7 | _ | |a 1879-2944 |2 ISSN |
024 | 7 | _ | |a 10.34734/FZJ-2023-05489 |2 datacite_doi |
024 | 7 | _ | |a WOS:000876219500004 |2 WOS |
037 | _ | _ | |a FZJ-2023-05489 |
041 | _ | _ | |a English |
082 | _ | _ | |a 530 |
100 | 1 | _ | |a Durr, Stephan |0 P:(DE-Juel1)132580 |b 0 |e Corresponding author |
245 | _ | _ | |a Portable CPU implementation of Wilson, Brillouin and Susskind fermions in lattice QCD |
260 | _ | _ | |a Amsterdam |c 2023 |b North Holland Publ. Co. |
336 | 7 | _ | |a article |2 DRIVER |
336 | 7 | _ | |a Output Types/Journal article |2 DataCite |
336 | 7 | _ | |a Journal Article |b journal |m journal |0 PUB:(DE-HGF)16 |s 1703074724_2977 |2 PUB:(DE-HGF) |
336 | 7 | _ | |a ARTICLE |2 BibTeX |
336 | 7 | _ | |a JOURNAL_ARTICLE |2 ORCID |
336 | 7 | _ | |a Journal Article |0 0 |2 EndNote |
520 | _ | _ | |a A modern Fortran implementation of three Dirac operators (Wilson, Brillouin, Susskind) in lattice QCD is presented, based on OpenMP shared-memory parallelization and SIMD pragmas. The main idea is to apply a Dirac operator to $N_v$ vectors simultaneously, to ease the memory bandwidth bottleneck. All index computations are left to the compiler and maximum weight is given to portability and flexibility. The lattice volume, $N_x N_y N_z N_t$, the number of colors, $N_c$, and the number of right-hand sides, $N_v$, are parameters defined at compile time. Several memory layout options are compared. The code performs well on modern many-core architectures (480\,Gflop/s, 880\,Gflop/s, and 780\,Gflop/s with $N_v=12$ for the three operators in single precision on a 72-core KNL processor; a $2 \times 24$-core Skylake node yields similar results). Explicit run-time tests with CG/BiCGstab inverters confirm that the memory layout is relevant for the KNL, but less so for the Skylake architecture. The ancillary code distribution contains all routines, including the single, double, and mixed precision Krylov space solvers, to render it self-contained and ready-to-use. |
536 | _ | _ | |a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511) |0 G:(DE-HGF)POF4-5111 |c POF4-511 |f POF IV |x 0 |
536 | _ | _ | |a DFG project 448374536 - Fortschritte bei einer präzisen ab initio Bestimmung der Partonen-Struktur von Hadronen (448374536) |0 G:(GEPRIS)448374536 |c 448374536 |x 1 |
588 | _ | _ | |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de |
773 | _ | _ | |a 10.1016/j.cpc.2022.108555 |g Vol. 282, p. 108555 - |0 PERI:(DE-600)1466511-6 |p 108555 - |t Computer physics communications |v 282 |y 2023 |x 0010-4655 |
856 | 4 | _ | |y Published on 2022-09-26. Available in OpenAccess from 2024-09-26. |u https://juser.fz-juelich.de/record/1019542/files/testknl_v3.pdf |
856 | 4 | _ | |y Published on 2022-09-26. Available in OpenAccess from 2024-09-26. |x icon |u https://juser.fz-juelich.de/record/1019542/files/testknl_v3.gif?subformat=icon |
856 | 4 | _ | |y Published on 2022-09-26. Available in OpenAccess from 2024-09-26. |x icon-1440 |u https://juser.fz-juelich.de/record/1019542/files/testknl_v3.jpg?subformat=icon-1440 |
856 | 4 | _ | |y Published on 2022-09-26. Available in OpenAccess from 2024-09-26. |x icon-180 |u https://juser.fz-juelich.de/record/1019542/files/testknl_v3.jpg?subformat=icon-180 |
856 | 4 | _ | |y Published on 2022-09-26. Available in OpenAccess from 2024-09-26. |x icon-640 |u https://juser.fz-juelich.de/record/1019542/files/testknl_v3.jpg?subformat=icon-640 |
909 | C | O | |o oai:juser.fz-juelich.de:1019542 |p openaire |p open_access |p VDB |p driver |p dnbdelivery |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 0 |6 P:(DE-Juel1)132580 |
913 | 1 | _ | |a DE-HGF |b Key Technologies |l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action |1 G:(DE-HGF)POF4-510 |0 G:(DE-HGF)POF4-511 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Enabling Computational- & Data-Intensive Science and Engineering |9 G:(DE-HGF)POF4-5111 |x 0 |
914 | 1 | _ | |y 2023 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0150 |2 StatID |b Web of Science Core Collection |d 2023-08-25 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0300 |2 StatID |b Medline |d 2023-08-25 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0600 |2 StatID |b Ebsco Academic Search |d 2023-08-25 |
915 | _ | _ | |a Creative Commons Attribution-NonCommercial-NoDerivs CC BY-NC-ND 4.0 |0 LIC:(DE-HGF)CCBYNCND4 |2 HGFVOC |
915 | _ | _ | |a Embargoed OpenAccess |0 StatID:(DE-HGF)0530 |2 StatID |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)1150 |2 StatID |b Current Contents - Physical, Chemical and Earth Sciences |d 2023-08-25 |
915 | _ | _ | |a IF >= 5 |0 StatID:(DE-HGF)9905 |2 StatID |b COMPUT PHYS COMMUN : 2022 |d 2023-08-25 |
915 | _ | _ | |a WoS |0 StatID:(DE-HGF)0113 |2 StatID |b Science Citation Index Expanded |d 2023-08-25 |
915 | _ | _ | |a JCR |0 StatID:(DE-HGF)0100 |2 StatID |b COMPUT PHYS COMMUN : 2022 |d 2023-08-25 |
915 | _ | _ | |a No Peer Review |0 StatID:(DE-HGF)0020 |2 StatID |b ASC |d 2023-08-25 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0160 |2 StatID |b Essential Science Indicators |d 2023-08-25 |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0200 |2 StatID |b SCOPUS |d 2023-08-25 |
915 | _ | _ | |a Nationallizenz |0 StatID:(DE-HGF)0420 |2 StatID |d 2023-08-25 |w ger |
915 | _ | _ | |a DBCoverage |0 StatID:(DE-HGF)0199 |2 StatID |b Clarivate Analytics Master Journal List |d 2023-08-25 |
920 | _ | _ | |l yes |
920 | 1 | _ | |0 I:(DE-Juel1)JSC-20090406 |k JSC |l Jülich Supercomputing Center |x 0 |
980 | _ | _ | |a journal |
980 | _ | _ | |a VDB |
980 | _ | _ | |a UNRESTRICTED |
980 | _ | _ | |a I:(DE-Juel1)JSC-20090406 |
980 | 1 | _ | |a FullTexts |