000859687 001__ 859687
000859687 005__ 20210130000351.0
000859687 0247_ $$2arXiv$$aarXiv:1808.05506
000859687 0247_ $$2Handle$$a2128/21382
000859687 0247_ $$2altmetric$$aaltmetric:46609682
000859687 037__ $$aFZJ-2019-00526
000859687 041__ $$aEnglish
000859687 1001_ $$0P:(DE-Juel1)132580$$aDurr, Stephan$$b0$$eCorresponding author
000859687 1112_ $$a36th Annual International Symposium on Lattice Field Theory, Lattice 2018$$cEast Lansing$$d2018-07-22 - 2018-07-28$$gLattice 2018$$wUSA
000859687 245__ $$aThree Dirac operators on two architectures with one piece of code and no hassle
000859687 260__ $$aTrieste$$bSISSA$$c2018
000859687 300__ $$a7 p.
000859687 3367_ $$2ORCID$$aCONFERENCE_PAPER
000859687 3367_ $$033$$2EndNote$$aConference Paper
000859687 3367_ $$2BibTeX$$aINPROCEEDINGS
000859687 3367_ $$2DRIVER$$aconferenceObject
000859687 3367_ $$2DataCite$$aOutput Types/Conference Paper
000859687 3367_ $$0PUB:(DE-HGF)8$$2PUB:(DE-HGF)$$aContribution to a conference proceedings$$bcontrib$$mcontrib$$s1548423558_27135
000859687 3367_ $$0PUB:(DE-HGF)7$$2PUB:(DE-HGF)$$aContribution to a book$$mcontb
000859687 4900_ $$aProceedings of Science$$vLATTICE2018
000859687 520__ $$aA simple-minded approach to implementing three discretizations of the Dirac operator (staggered, Wilson, Brillouin) on two architectures (KNL and Core i7) is presented. The idea is to use a high-level compiler along with OpenMP parallelization and SIMD pragmas, but to stay away from cache-line optimization and/or assembly tuning. The implementation is for N_v right-hand sides, and this extra index is used to fill the SIMD pipeline. On one KNL node, single-precision performance figures for N_c=3, N_v=12 read 475 Gflop/s, 345 Gflop/s, and 790 Gflop/s for the three discretization schemes, respectively.
000859687 536__ $$0G:(DE-HGF)POF3-511$$a511 - Computational Science and Mathematical Methods (POF3-511)$$cPOF3-511$$fPOF III$$x0
000859687 588__ $$aDataset connected to arXiv
000859687 773__ $$p033
000859687 8564_ $$uhttps://juser.fz-juelich.de/record/859687/files/1808.05506.pdf$$yOpenAccess
000859687 8564_ $$uhttps://juser.fz-juelich.de/record/859687/files/1808.05506.pdf?subformat=pdfa$$xpdfa$$yOpenAccess
000859687 909CO $$ooai:juser.fz-juelich.de:859687$$pdnbdelivery$$pdriver$$pVDB$$popen_access$$popenaire
000859687 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132580$$aForschungszentrum Jülich$$b0$$kFZJ
000859687 9131_ $$0G:(DE-HGF)POF3-511$$1G:(DE-HGF)POF3-510$$2G:(DE-HGF)POF3-500$$3G:(DE-HGF)POF3$$4G:(DE-HGF)POF$$aDE-HGF$$bKey Technologies$$lSupercomputing & Big Data$$vComputational Science and Mathematical Methods$$x0
000859687 9141_ $$y2018
000859687 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000859687 915__ $$0LIC:(DE-HGF)CCBYNCND4$$2HGFVOC$$aCreative Commons Attribution-NonCommercial-NoDerivs CC BY-NC-ND 4.0
000859687 920__ $$lyes
000859687 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000859687 980__ $$acontrib
000859687 980__ $$aVDB
000859687 980__ $$aUNRESTRICTED
000859687 980__ $$acontb
000859687 980__ $$aI:(DE-Juel1)JSC-20090406
000859687 9801_ $$aFullTexts