001049545 001__ 1049545
001049545 005__ 20251223202201.0
001049545 0247_ $$2doi$$a10.3389/fhpcp.2025.1669101
001049545 037__ $$aFZJ-2025-05349
001049545 082__ $$a004
001049545 1001_ $$0P:(DE-Juel1)179531$$aFalquez, Carlos$$b0$$ufzj
001049545 245__ $$aProcessor simulation as a tool for performance engineering
001049545 260__ $$aBeijing$$bFrontiers Media SA$$c2025
001049545 3367_ $$2DRIVER$$aarticle
001049545 3367_ $$2DataCite$$aOutput Types/Journal article
001049545 3367_ $$0PUB:(DE-HGF)16$$2PUB:(DE-HGF)$$aJournal Article$$bjournal$$mjournal$$s1766499915_20624
001049545 3367_ $$2BibTeX$$aARTICLE
001049545 3367_ $$2ORCID$$aJOURNAL_ARTICLE
001049545 3367_ $$00$$2EndNote$$aJournal Article
001049545 520__ $$aThe diversity of processor architectures used for High-Performance Computing (HPC) applications has increased significantly over the last few years. This trend is expected to continue for several reasons, including the emergence of various instruction set extensions. Examples are the renewed interest in vector instructions such as Arm's Scalable Vector Extension (SVE) or RISC-V's RVV. For application developers, research software developers, and performance engineers, the increased diversity and complexity of architectures have led to two challenges: limited access to these different processor architectures, and more difficult root cause analysis in case of performance issues. To address these challenges, we propose leveraging the much-improved capabilities of processor simulators such as gem5. We enhanced this simulator with a performance analysis framework. We extend the available performance counters and introduce new analysis capabilities to track the temporal behaviour of running applications. An algorithm has been implemented to link these statistics to specific code regions. The resulting performance profiles allow for the identification of code regions with the potential for optimization. The focus is on observables that monitor quantities that are usually not directly accessible on real hardware. Different algorithms have been implemented to identify potential performance bottlenecks. The framework is evaluated for different types of HPC applications, such as the molecular-dynamics application GROMACS; Ligra, which implements the breadth-first search (BFS) algorithm; and a kernel from the Lattice QCD solver DD-αAMG.
001049545 536__ $$0G:(DE-HGF)POF4-5122$$a5122 - Future Computing & Big Data Systems (POF4-512)$$cPOF4-512$$fPOF IV$$x0
001049545 536__ $$0G:(BMBF)16ME0507K$$aEPI SGA2 (16ME0507K)$$c16ME0507K$$x1
001049545 536__ $$0G:(EU-Grant)101034126$$aThe European PILOT - Pilot using Independent Local & Open Technologies (101034126)$$c101034126$$fH2020-JTI-EuroHPC-2020-1$$x2
001049545 536__ $$0G:(EU-Grant)101072344$$aAQTIVATE - Advanced computing, quantum algorithms, and data-driven approaches for science, technology and engineering (101072344)$$c101072344$$fHORIZON-MSCA-2021-DN-01$$x3
001049545 588__ $$aDataset connected to CrossRef, Journals: juser.fz-juelich.de
001049545 7001_ $$0P:(DE-Juel1)185766$$aLong, Shiting$$b1
001049545 7001_ $$0P:(DE-Juel1)176469$$aHo, Nam$$b2$$ufzj
001049545 7001_ $$0P:(DE-Juel1)142361$$aSuarez, Estela$$b3$$ufzj
001049545 7001_ $$0P:(DE-Juel1)144441$$aPleiter, Dirk$$b4$$eCorresponding author
001049545 773__ $$0PERI:(DE-600)3210045-0$$a10.3389/fhpcp.2025.1669101$$gVol. 3, p. 1669101$$p1669101$$tFrontiers in high performance computing$$v3$$x2813-7337$$y2025
001049545 8564_ $$uhttps://juser.fz-juelich.de/record/1049545/files/fhpcp-3-1669101.pdf$$yRestricted
001049545 909CO $$ooai:juser.fz-juelich.de:1049545$$popenaire$$pec_fundedresources
001049545 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)179531$$aForschungszentrum Jülich$$b0$$kFZJ
001049545 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)176469$$aForschungszentrum Jülich$$b2$$kFZJ
001049545 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)142361$$aForschungszentrum Jülich$$b3$$kFZJ
001049545 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5122$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x0
001049545 9141_ $$y2025
001049545 920__ $$lyes
001049545 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
001049545 980__ $$ajournal
001049545 980__ $$aEDITORS
001049545 980__ $$aVDBINPRINT
001049545 980__ $$aI:(DE-Juel1)JSC-20090406
001049545 980__ $$aUNRESTRICTED