001     1049545
005     20251223202201.0
024 7 _ |a 10.3389/fhpcp.2025.1669101
|2 doi
037 _ _ |a FZJ-2025-05349
082 _ _ |a 004
100 1 _ |a Falquez, Carlos
|0 P:(DE-Juel1)179531
|b 0
|u fzj
245 _ _ |a Processor simulation as a tool for performance engineering
260 _ _ |a Beijing
|c 2025
|b Frontiers Media SA
336 7 _ |a article
|2 DRIVER
336 7 _ |a Output Types/Journal article
|2 DataCite
336 7 _ |a Journal Article
|b journal
|m journal
|0 PUB:(DE-HGF)16
|s 1766499915_20624
|2 PUB:(DE-HGF)
336 7 _ |a ARTICLE
|2 BibTeX
336 7 _ |a JOURNAL_ARTICLE
|2 ORCID
336 7 _ |a Journal Article
|0 0
|2 EndNote
520 _ _ |a The diversity of processor architectures used for High-Performance Computing (HPC) applications has increased significantly over the last few years. This trend is expected to continue for different reasons, including the emergence of various instruction set extensions. Examples are the renewed interest in vector instructions such as Arm's Scalable Vector Extension (SVE) or RISC-V's RVV. For application developers, research software developers, and performance engineers, the increased diversity and complexity of architectures have led to the following challenges: limited access to these different processor architectures and more difficult root cause analysis in case of performance issues. To address these challenges, we propose leveraging the much-improved capabilities of processor simulators such as gem5. We enhanced this simulator with a performance analysis framework. We extended the available performance counters and introduced new analysis capabilities to track the temporal behaviour of running applications. An algorithm has been implemented to link these statistics to specific regions. The resulting performance profiles allow for the identification of code regions with the potential for optimization. The focus is on observables to monitor quantities that are usually not directly accessible on real hardware. Different algorithms have been implemented to identify potential performance bottlenecks. The framework is evaluated for different types of HPC applications, such as the molecular-dynamics application GROMACS, Ligra, which implements the breadth-first search (BFS) algorithm, and a kernel from the Lattice QCD solver DD-αAMG.
536 _ _ |a 5122 - Future Computing & Big Data Systems (POF4-512)
|0 G:(DE-HGF)POF4-5122
|c POF4-512
|f POF IV
|x 0
536 _ _ |a EPI SGA2 (16ME0507K)
|0 G:(BMBF)16ME0507K
|c 16ME0507K
|x 1
536 _ _ |a The European PILOT - Pilot using Independent Local & Open Technologies (101034126)
|0 G:(EU-Grant)101034126
|c 101034126
|f H2020-JTI-EuroHPC-2020-1
|x 2
536 _ _ |a AQTIVATE - Advanced computing, quantum algorithms, and data-driven approaches for science, technology and engineering (101072344)
|0 G:(EU-Grant)101072344
|c 101072344
|f HORIZON-MSCA-2021-DN-01
|x 3
588 _ _ |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de
700 1 _ |a Long, Shiting
|0 P:(DE-Juel1)185766
|b 1
700 1 _ |a Ho, Nam
|0 P:(DE-Juel1)176469
|b 2
|u fzj
700 1 _ |a Suarez, Estela
|0 P:(DE-Juel1)142361
|b 3
|u fzj
700 1 _ |a Pleiter, Dirk
|0 P:(DE-Juel1)144441
|b 4
|e Corresponding author
773 _ _ |a 10.3389/fhpcp.2025.1669101
|g Vol. 3, p. 1669101
|0 PERI:(DE-600)3210045-0
|p 1669101
|t Frontiers in high performance computing
|v 3
|y 2025
|x 2813-7337
856 4 _ |u https://juser.fz-juelich.de/record/1049545/files/fhpcp-3-1669101.pdf
|y Restricted
909 C O |o oai:juser.fz-juelich.de:1049545
|p openaire
|p ec_fundedresources
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)179531
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)176469
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 3
|6 P:(DE-Juel1)142361
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-512
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Supercomputing & Big Data Infrastructures
|9 G:(DE-HGF)POF4-5122
|x 0
914 1 _ |y 2025
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a journal
980 _ _ |a EDITORS
980 _ _ |a VDBINPRINT
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a UNRESTRICTED

