001019542 001__ 1019542
001019542 005__ 20240109115103.0
001019542 0247_ $$2doi$$a10.1016/j.cpc.2022.108555
001019542 0247_ $$2ISSN$$a0010-4655
001019542 0247_ $$2ISSN$$a1386-9485
001019542 0247_ $$2ISSN$$a1879-2944
001019542 0247_ $$2datacite_doi$$a10.34734/FZJ-2023-05489
001019542 0247_ $$2WOS$$aWOS:000876219500004
001019542 037__ $$aFZJ-2023-05489
001019542 041__ $$aEnglish
001019542 082__ $$a530
001019542 1001_ $$0P:(DE-Juel1)132580$$aDurr, Stephan$$b0$$eCorresponding author
001019542 245__ $$aPortable CPU implementation of Wilson, Brillouin and Susskind fermions in lattice QCD
001019542 260__ $$aAmsterdam$$bNorth Holland Publ. Co.$$c2023
001019542 3367_ $$2DRIVER$$aarticle
001019542 3367_ $$2DataCite$$aOutput Types/Journal article
001019542 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1703074724_2977
001019542 3367_ $$2BibTeX$$aARTICLE
001019542 3367_ $$2ORCID$$aJOURNAL_ARTICLE
001019542 3367_ $$00$$2EndNote$$aJournal Article
001019542 520__ $$aA modern Fortran implementation of three Dirac operators (Wilson, Brillouin, Susskind) in lattice QCD is presented, based on OpenMP shared-memory parallelization and SIMD pragmas. The main idea is to apply a Dirac operator to $N_v$ vectors simultaneously, to ease the memory bandwidth bottleneck. All index computations are left to the compiler and maximum weight is given to portability and flexibility. The lattice volume, $N_x N_y N_z N_t$, the number of colors, $N_c$, and the number of right-hand sides, $N_v$, are parameters defined at compile time. Several memory layout options are compared. The code performs well on modern many-core architectures (480\,Gflop/s, 880\,Gflop/s, and 780\,Gflop/s with $N_v=12$ for the three operators in single precision on a 72-core KNL processor; a $2 \times 24$-core Skylake node yields similar results). Explicit run-time tests with CG/BiCGstab inverters confirm that the memory layout is relevant for the KNL, but less so for the Skylake architecture. The ancillary code distribution contains all routines, including the single, double, and mixed precision Krylov space solvers, to render it self-contained and ready-to-use.
001019542 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
001019542 536__ $$0G:(GEPRIS)448374536$$aDFG project 448374536 - Fortschritte bei einer präzisen ab initio Bestimmung der Partonen-Struktur von Hadronen (448374536)$$c448374536$$x1
001019542 588__ $$aDataset connected to CrossRef, Journals: juser.fz-juelich.de
001019542 773__ $$0PERI:(DE-600)1466511-6$$a10.1016/j.cpc.2022.108555$$gVol. 282, p. 108555 -$$p108555 -$$tComputer physics communications$$v282$$x0010-4655$$y2023
001019542 8564_ $$uhttps://juser.fz-juelich.de/record/1019542/files/testknl_v3.pdf$$yPublished on 2022-09-26. Available in OpenAccess from 2024-09-26.
001019542 8564_ $$uhttps://juser.fz-juelich.de/record/1019542/files/testknl_v3.gif?subformat=icon$$xicon$$yPublished on 2022-09-26. Available in OpenAccess from 2024-09-26.
001019542 8564_ $$uhttps://juser.fz-juelich.de/record/1019542/files/testknl_v3.jpg?subformat=icon-1440$$xicon-1440$$yPublished on 2022-09-26. Available in OpenAccess from 2024-09-26.
001019542 8564_ $$uhttps://juser.fz-juelich.de/record/1019542/files/testknl_v3.jpg?subformat=icon-180$$xicon-180$$yPublished on 2022-09-26. Available in OpenAccess from 2024-09-26.
001019542 8564_ $$uhttps://juser.fz-juelich.de/record/1019542/files/testknl_v3.jpg?subformat=icon-640$$xicon-640$$yPublished on 2022-09-26. Available in OpenAccess from 2024-09-26.
001019542 909CO $$ooai:juser.fz-juelich.de:1019542$$pdnbdelivery$$pdriver$$pVDB$$popen_access$$popenaire
001019542 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132580$$aForschungszentrum Jülich$$b0$$kFZJ
001019542 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
001019542 9141_ $$y2023
001019542 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)0600$$2StatID$$aDBCoverage$$bEbsco Academic Search$$d2023-08-25
001019542 915__ $$0LIC:(DE-HGF)CCBYNCND4$$2HGFVOC$$aCreative Commons Attribution-NonCommercial-NoDerivs CC BY-NC-ND 4.0
001019542 915__ $$0StatID:(DE-HGF)0530$$2StatID$$aEmbargoed OpenAccess
001019542 915__ $$0StatID:(DE-HGF)1150$$2StatID$$aDBCoverage$$bCurrent Contents - Physical, Chemical and Earth Sciences$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)9905$$2StatID$$aIF >= 5$$bCOMPUT PHYS COMMUN : 2022$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bCOMPUT PHYS COMMUN : 2022$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)0020$$2StatID$$aNo Peer Review$$bASC$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2023-08-25
001019542 915__ $$0StatID:(DE-HGF)0420$$2StatID$$aNationallizenz$$d2023-08-25$$wger
001019542 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2023-08-25
001019542 920__ $$lyes
001019542 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001019542 980__ $$ajournal
001019542 980__ $$aVDB
001019542 980__ $$aUNRESTRICTED
001019542 980__ $$aI:(DE-Juel1)JSC-20090406
001019542 9801_ $$aFullTexts