000893756 001__ 893756
000893756 005__ 20250822121434.0
000893756 0247_ $$2doi$$a10.1145/3452412.3462752
000893756 0247_ $$2Handle$$a2128/28078
000893756 0247_ $$2WOS$$aWOS:001322551200001
000893756 037__ $$aFZJ-2021-02811
000893756 041__ $$aEnglish
000893756 1001_ $$0P:(DE-Juel1)145478$$aHerten, Andreas$$b0$$eCorresponding author$$ufzj
000893756 1112_ $$aThe 30th International Symposium on High-Performance Parallel and Distributed Computing, PERMAVOST Workshop$$cVirtual$$d2021-06-21 - 2021-06-25$$gHPDC21$$wSweden
000893756 245__ $$aJUWELS Booster - Early User Experiences
000893756 260__ $$c2021
000893756 3367_ $$033$$2EndNote$$aConference Paper
000893756 3367_ $$2DataCite$$aOther
000893756 3367_ $$2BibTeX$$aINPROCEEDINGS
000893756 3367_ $$2DRIVER$$aconferenceObject
000893756 3367_ $$2ORCID$$aLECTURE_SPEECH
000893756 3367_ $$0PUB:(DE-HGF)6$$2PUB:(DE-HGF)$$aConference Presentation$$bconf$$mconf$$s1625903664_29986$$xPlenary/Keynote
000893756 520__ $$aOver the last few years, GPUs have become ubiquitous in HPC installations around the world. Today, they provide the main source of performance in a number of Top500 machines, for example Summit, Sierra, and JUWELS Booster. For the upcoming Exascale era, too, GPUs have been selected as key enablers and will be installed in large numbers. While individual GPU devices already offer plenty of performance (O(10) TFLOP/s FP64), current and next-generation supercomputers employ them in the thousands. Using these machines to the fullest extent means not only utilizing individual devices efficiently, but using the entire interconnected system of devices thoroughly. JUWELS Booster is a recently installed Tier-0/1 system at Jülich Supercomputing Centre (JSC), currently the 7th-fastest supercomputer in the world, and the fastest in Europe. JUWELS Booster features 936 nodes, each equipped with 4 NVIDIA A100 Tensor Core GPUs and 4 Mellanox HDR200 InfiniBand HCAs. The peak performance of all GPUs together sums up to 73 PFLOP/s, and the system features a DragonFly+ network topology with 800 Gbit/s network injection bandwidth per node. During installation of JUWELS Booster, a selected set of applications was given access to the system as part of the JUWELS Booster Early Access Program. To prepare for their first compute time allocation, scientific users were able to gain first experiences on the machine. They gave direct feedback to the system operations team during installation and beyond. Close collaboration was facilitated with the application support staff of JSC, giving unique insights into the individual processes of utilizing a brand-new large-scale system for the first time. Likewise, performance profiles of applications could be studied and collaboratively analyzed, employing available tools and methods. Performance limiters of the specific application on the platform were identified and proposals for improvement were developed. This talk will present first experiences with JUWELS Booster and the applications utilizing the system during its first months. Applied methods for onboarding, analysis, and optimization will be shown and assessed. Highlights of the state of the art of performance analysis and modeling for GPUs will be presented with concrete examples from the JUWELS Booster Early Access Program.
000893756 536__ $$0G:(DE-HGF)POF4-5121$$a5121 - Supercomputing & Big Data Facilities (POF4-512)$$cPOF4-512$$fPOF IV$$x0
000893756 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x1
000893756 536__ $$0G:(DE-Juel-1)ATML-X-DEV$$aATML-X-DEV - ATML Accelerating Devices (ATML-X-DEV)$$cATML-X-DEV$$x2
000893756 588__ $$aDataset connected to CrossRef Conference
000893756 773__ $$a10.1145/3452412.3462752
000893756 8564_ $$uhttps://permavost.github.io/
000893756 8564_ $$uhttps://juser.fz-juelich.de/record/893756/files/3452412.3462752.pdf$$yOpenAccess
000893756 8564_ $$uhttps://juser.fz-juelich.de/record/893756/files/Overlays-reduced%20Slides.pdf$$yRestricted
000893756 8564_ $$uhttps://juser.fz-juelich.de/record/893756/files/Slides.pdf$$yRestricted
000893756 909CO $$ooai:juser.fz-juelich.de:893756$$pdriver$$pVDB$$popen_access$$popenaire
000893756 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)145478$$aForschungszentrum Jülich$$b0$$kFZJ
000893756 9131_ $$0G:(DE-HGF)POF4-512$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5121$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vSupercomputing & Big Data Infrastructures$$x0
000893756 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x1
000893756 9141_ $$y2021
000893756 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000893756 920__ $$lyes
000893756 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000893756 980__ $$aconf
000893756 980__ $$aVDB
000893756 980__ $$aUNRESTRICTED
000893756 980__ $$aI:(DE-Juel1)JSC-20090406
000893756 9801_ $$aFullTexts