001042334 001__ 1042334
001042334 005__ 20250804115218.0
001042334 0247_ $$2doi$$a10.1109/ACCESS.2025.3569533
001042334 0247_ $$2datacite_doi$$a10.34734/FZJ-2025-02537
001042334 0247_ $$2WOS$$aWOS:001492121500023
001042334 037__ $$aFZJ-2025-02537
001042334 082__ $$a621.3
001042334 1001_ $$0P:(DE-Juel1)176469$$aHo, Nam$$b0$$eCorresponding author$$ufzj
001042334 245__ $$aMemory Prefetching Evaluation of Scientific Applications on a Modern HPC Arm-Based Processor
001042334 260__ $$aNew York, NY$$bIEEE$$c2025
001042334 3367_ $$2DRIVER$$aarticle
001042334 3367_ $$2DataCite$$aOutput Types/Journal article
001042334 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1753033141_20204
001042334 3367_ $$2BibTeX$$aARTICLE
001042334 3367_ $$2ORCID$$aJOURNAL_ARTICLE
001042334 3367_ $$00$$2EndNote$$aJournal Article
001042334 520__ $$aMemory prefetching is a well-known technique for mitigating the negative impact of memory access latencies on memory bandwidth. This problem has become more pressing as improvements in memory bandwidth have not kept pace with increases in computational power. While much existing work has been devoted to finding appropriate prefetching techniques for specific workloads, few studies provide insight into the behavior of scientific applications to better understand the impact of prefetchers. This paper investigates the impact of hardware prefetchers on the latest Arm-based high-end processor architectures. In this work, we investigate memory access patterns by analyzing locality properties and visualizing delta and repetitive address patterns. A deeper understanding of memory access patterns makes it possible to select the appropriate prefetcher and to correlate access pattern properties more closely with prefetcher performance. This can guide future co-design efforts. We evaluated traditional and innovative prefetchers using a gem5-based model of Arm Neoverse V1 cores. The model features a 16-core architecture, using Amazon’s Graviton 3 processor as a hardware reference, but replacing DDR5 with high-bandwidth memory (HBM2). We performed a detailed prefetching evaluation focusing on stencil, sparse matrix-vector multiplication, and breadth-first search kernels. These kernels represent a broad range of applications that run on today’s high-performance computing (HPC) systems and are sensitive to memory performance.
001042334 536__ $$0G:(DE-HGF)POF4-5122$$a5122 - Future Computing & Big Data Systems (POF4-512)$$cPOF4-512$$fPOF IV$$x0
001042334 536__ $$0G:(BMBF)16ME0507K$$aEPI SGA2 (16ME0507K)$$c16ME0507K$$x1
001042334 536__ $$0G:(EU-Grant)826647$$aEPI SGA1 - SGA1 (Specific Grant Agreement 1) OF THE EUROPEAN PROCESSOR INITIATIVE (EPI) (826647)$$c826647$$fH2020-SGA-LPMT-2018$$x2
001042334 588__ $$aDataset connected to DataCite
001042334 7001_ $$0P:(DE-Juel1)179531$$aFalquez, Carlos$$b1$$ufzj
001042334 7001_ $$0P:(DE-Juel1)177768$$aPortero, Antoni$$b2
001042334 7001_ $$0P:(DE-Juel1)142361$$aSuarez, Estela$$b3$$ufzj
001042334 7001_ $$0P:(DE-Juel1)144441$$aPleiter, Dirk$$b4
001042334 773__ $$0PERI:(DE-600)2687964-5$$a10.1109/ACCESS.2025.3569533$$p85898 - 85926$$tIEEE access$$v13$$x2169-3536$$y2025
001042334 8564_ $$uhttps://juser.fz-juelich.de/record/1042334/files/APC600663786.pdf
001042334 8564_ $$uhttps://juser.fz-juelich.de/record/1042334/files/Memory_Prefetching_Evaluation_of_Scientific_Applications_on_a_Modern_HPC_Arm-Based_Processor.pdf$$yOpenAccess
001042334 8767_ $$8APC600663786$$92025-05-12$$a1200214093$$d2025-05-19$$eAPC$$jPayment made$$z2075 USD
001042334 909CO $$ooai:juser.fz-juelich.de:1042334$$pdnbdelivery$$popenCost$$pec_fundedresources$$pVDB$$pdriver$$pOpenAPC$$popen_access$$popenaire
001042334 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)176469$$aForschungszentrum Jülich$$b0$$kFZJ
001042334 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)179531$$aForschungszentrum Jülich$$b1$$kFZJ
001042334 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)142361$$aForschungszentrum Jülich$$b3$$kFZJ
001042334 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5122$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x0
001042334 9141_ $$y2025
001042334 915pc $$0PC:(DE-HGF)0000$$2APC$$aAPC keys set
001042334 915pc $$0PC:(DE-HGF)0003$$2APC$$aDOAJ Journal
001042334 915__ $$0StatID:(DE-HGF)0200$$2StatID$$aDBCoverage$$bSCOPUS$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)0160$$2StatID$$aDBCoverage$$bEssential Science Indicators$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)1160$$2StatID$$aDBCoverage$$bCurrent Contents - Engineering, Computing and Technology$$d2025-01-02
001042334 915__ $$0LIC:(DE-HGF)CCBY4$$2HGFVOC$$aCreative Commons Attribution CC BY 4.0
001042334 915__ $$0StatID:(DE-HGF)0100$$2StatID$$aJCR$$bIEEE ACCESS : 2022$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)0501$$2StatID$$aDBCoverage$$bDOAJ Seal$$d2024-04-03T10:39:05Z
001042334 915__ $$0StatID:(DE-HGF)0500$$2StatID$$aDBCoverage$$bDOAJ$$d2024-04-03T10:39:05Z
001042334 915__ $$0StatID:(DE-HGF)0113$$2StatID$$aWoS$$bScience Citation Index Expanded$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)0700$$2StatID$$aFees$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)0150$$2StatID$$aDBCoverage$$bWeb of Science Core Collection$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)9900$$2StatID$$aIF < 5$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001042334 915__ $$0StatID:(DE-HGF)0030$$2StatID$$aPeer Review$$bDOAJ : Anonymous peer review$$d2024-04-03T10:39:05Z
001042334 915__ $$0StatID:(DE-HGF)0561$$2StatID$$aArticle Processing Charges$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)1230$$2StatID$$aDBCoverage$$bCurrent Contents - Electronics and Telecommunications Collection$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)0300$$2StatID$$aDBCoverage$$bMedline$$d2025-01-02
001042334 915__ $$0StatID:(DE-HGF)0199$$2StatID$$aDBCoverage$$bClarivate Analytics Master Journal List$$d2025-01-02
001042334 920__ $$lyes
001042334 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001042334 980__ $$ajournal
001042334 980__ $$aVDB
001042334 980__ $$aUNRESTRICTED
001042334 980__ $$aI:(DE-Juel1)JSC-20090406
001042334 980__ $$aAPC
001042334 9801_ $$aAPC
001042334 9801_ $$aFullTexts