001     1043552
005     20250724210254.0
024 7 _ |a 10.34734/FZJ-2025-02926
|2 datacite_doi
037 _ _ |a FZJ-2025-02926
100 1 _ |a Breuer, Thomas
|0 P:(DE-Juel1)138707
|b 0
|e Corresponding author
|u fzj
111 2 _ |a ISC High Performance 2025
|g ISC25
|c Hamburg
|d 2025-06-10 - 2025-06-14
|w Germany
245 _ _ |a The Art of Process Pinning: Turning Chaos into Core Harmony
260 _ _ |c 2025
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a CONFERENCE_POSTER
|2 ORCID
336 7 _ |a Output Types/Conference Poster
|2 DataCite
336 7 _ |a Poster
|b poster
|m poster
|0 PUB:(DE-HGF)24
|s 1753340973_1149
|2 PUB:(DE-HGF)
|x After Call
500 _ _ |a This poster was awarded second prize in the Best Research Poster category.
520 _ _ |a High-Performance Computing (HPC) centres face growing challenges as user numbers and application diversity increase, requiring systems to manage a wide range of workflows. While users prioritise scientific output over specific configurations, administrators strive to keep systems fully utilised with optimised jobs and to avoid resource waste. However, no single default environment can address the diverse needs of users and applications, given the complex landscape of unique use cases. Process pinning - the binding of tasks and threads to specific CPU cores - is a vital yet often overlooked optimisation that significantly improves job performance. This technique benefits both CPU-intensive and GPU-enabled jobs. Proper pinning prevents process migration, ensures efficient memory access, and enables faster communication, improving system performance simply by adjusting workload manager parameters (e.g., in Slurm) without altering code. Metrics from various applications and benchmarks show that suboptimal pinning can drastically reduce performance, with production scenarios likely impacted even more.
Achieving optimal process pinning is challenging due to three interrelated factors:
- System side: Application layers and libraries (e.g., MPI, OpenMP, Slurm) interact with hardware architectures, affecting task and thread placement. Updates to these components can disrupt the expected pinning behaviour.
- User side: Users must consider the system architecture and configuration options, such as how to split processes and threads or how to distribute them across cores. Even with the same core usage pattern, the distribution can vary depending on workload manager options (e.g., Slurm `cpu-bind` and `distribution` values). Portability across systems is not guaranteed, often leading to suboptimal performance.
- Operator side: Administrators and support staff must monitor systems to ensure effective resource utilisation and address issues proactively. Identifying problematic jobs is difficult due to the variety of job characteristics, with inefficiencies often hidden in core usage patterns.
We developed tools and processes, based on investigations across diverse HPC systems, to address these challenges. These solutions enhance overall system throughput by identifying binding errors, guiding users in optimisation, and monitoring core usage. Our solutions include:
- A workflow that validates pinning distributions by running automated test jobs, periodically or manually, via the GitLab CI framework. Results are compared to expected outputs, with summaries generated and the full comparison displayed on the provider-targeted part of the JuPin pinning tool (https://go.fzj.de/pinning). These tests help HPC providers address issues before production, update documentation, and notify users of changes.
- A user-targeted interactive visualisation in JuPin that lets users test pinning options, visualise task distributions, and generate Slurm-compatible commands. Though focused on Slurm, it can be adapted to other workload managers.
- LLview (https://go.fzj.de/llview), an open-source monitoring and operational data analytics tool, which has been extended to monitor core usage patterns, providing statistics and aggregated computing times. This helps identify inefficiencies and intervene proactively.
Together, JuPin and LLview improve node utilisation, reduce waste, and make optimal pinning easier to achieve. These advancements translate into delivering more results in less time.
We published JuPin as open-source software on GitHub in May 2025 (https://github.com/FZJ-JSC/jupin). In conclusion, resolving pinning challenges is critical for optimising HPC systems. Our tools establish a foundation for scaling operations, including preparations for the JUPITER exascale supercomputer.
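A minimal sketch of the kind of Slurm pinning options the abstract refers to; the task, thread, and binding values below are hypothetical and depend on the target node architecture:
  # Hypothetical hybrid MPI/OpenMP launch: 4 tasks per node with 16 threads each,
  # each task bound to its own cores, tasks placed block-wise across nodes and
  # cyclically across sockets.
  export OMP_NUM_THREADS=16
  srun --ntasks-per-node=4 --cpus-per-task=16 \
       --cpu-bind=cores --distribution=block:cyclic ./app
Changing the `cpu-bind` or `distribution` values alters how the same set of cores is populated, which is what the JuPin visualisation lets users explore before submitting a job.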
536 _ _ |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
|0 G:(DE-HGF)POF4-5112
|c POF4-511
|f POF IV
|x 0
536 _ _ |a 5121 - Supercomputing & Big Data Facilities (POF4-512)
|0 G:(DE-HGF)POF4-5121
|c POF4-512
|f POF IV
|x 1
536 _ _ |a BMBF 01 1H1 6013, NRW 325 – 8.03 – 133340 - SiVeGCS (DB001492)
|0 G:(DE-Juel-1)DB001492
|c DB001492
|x 2
536 _ _ |a ATMLAO - ATML Application Optimization and User Service Tools (ATMLAO)
|0 G:(DE-Juel-1)ATMLAO
|c ATMLAO
|x 3
700 1 _ |a Guimaraes, Filipe
|0 P:(DE-Juel1)162225
|b 1
|u fzj
700 1 _ |a Himmels, Carina
|0 P:(DE-Juel1)184480
|b 2
|u fzj
700 1 _ |a Frings, Wolfgang
|0 P:(DE-Juel1)132108
|b 3
|u fzj
700 1 _ |a Paschoulas, Chrysovalantis
|0 P:(DE-Juel1)137040
|b 4
|u fzj
700 1 _ |a Göbbert, Jens Henrik
|0 P:(DE-Juel1)168541
|b 5
|u fzj
856 4 _ |u https://juser.fz-juelich.de/record/1043552/files/ISC25_JuPin_ResearchPoster.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:1043552
|p openaire
|p open_access
|p VDB
|p driver
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)138707
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)162225
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)184480
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 3
|6 P:(DE-Juel1)132108
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 4
|6 P:(DE-Juel1)137040
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 5
|6 P:(DE-Juel1)168541
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-511
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Enabling Computational- & Data-Intensive Science and Engineering
|9 G:(DE-HGF)POF4-5112
|x 0
913 1 _ |a DE-HGF
|b Key Technologies
|l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action
|1 G:(DE-HGF)POF4-510
|0 G:(DE-HGF)POF4-512
|3 G:(DE-HGF)POF4
|2 G:(DE-HGF)POF4-500
|4 G:(DE-HGF)POF
|v Supercomputing & Big Data Infrastructures
|9 G:(DE-HGF)POF4-5121
|x 1
914 1 _ |y 2025
915 _ _ |a OpenAccess
|0 StatID:(DE-HGF)0510
|2 StatID
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a poster
980 _ _ |a VDB
980 _ _ |a UNRESTRICTED
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 1 _ |a FullTexts

