000893827 001__ 893827
000893827 005__ 20211023141247.0
000893827 0247_ $$2doi$$a10.1109/IPDPSW52791.2021.00019
000893827 0247_ $$2Handle$$a2128/28081
000893827 0247_ $$2altmetric$$aaltmetric:109125167
000893827 0247_ $$2WOS$$aWOS:000689576200008
000893827 037__ $$aFZJ-2021-02866
000893827 1001_ $$0P:(DE-Juel1)132239$$aRiedel, Morris$$b0$$eCorresponding author$$ufzj
000893827 1112_ $$aIEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)$$cPortland$$d2021-06-17 - 2021-06-21$$wUSA
000893827 245__ $$aPractice and Experience in using Parallel and Scalable Machine Learning with Heterogenous Modular Supercomputing Architectures
000893827 260__ $$bIEEE$$c2021
000893827 300__ $$a76-85
000893827 3367_ $$2ORCID$$aCONFERENCE_PAPER
000893827 3367_ $$033$$2EndNote$$aConference Paper
000893827 3367_ $$2BibTeX$$aINPROCEEDINGS
000893827 3367_ $$2DRIVER$$aconferenceObject
000893827 3367_ $$2DataCite$$aOutput Types/Conference Paper
000893827 3367_ $$0PUB:(DE-HGF)8$$2PUB:(DE-HGF)$$aContribution to a conference proceedings$$bcontrib$$mcontrib$$s1634912175_10456
000893827 520__ $$aWe observe a continuously increasing use of Deep Learning (DL) as a specific type of Machine Learning (ML) for data-intensive problems (i.e., ’big data’) that requires powerful computing resources with equally increasing performance. Consequently, innovative heterogeneous High-Performance Computing (HPC) systems based on multi-core CPUs and many-core GPUs require an architectural design that addresses the requirements of end user communities that take advantage of ML and DL. Still, the workloads of end user communities of the simulation sciences (e.g., using numerical methods based on known physical laws) need to be equally supported in those architectures. This paper offers insights into the Modular Supercomputer Architecture (MSA) developed in the Dynamic Exascale Entry Platform (DEEP) series of projects to address the requirements of both simulation sciences and data-intensive sciences such as High Performance Data Analytics (HPDA). It shares insights into implementing the MSA at the Jülich Supercomputing Centre (JSC), which hosts Europe's No. 1 supercomputer, Jülich Wizard for European Leadership Science (JUWELS). We augment the technical findings with experience and lessons learned from two application community case studies (i.e., remote sensing and health sciences) using the MSA with JUWELS and the DEEP systems in practice. Thus, the paper provides details on specific MSA design elements that enable significant performance improvements of ML and DL algorithms. While this paper focuses on MSA-based HPC systems and application experience, we do not lose sight of advances in Cloud Computing (CC) and Quantum Computing (QC) relevant for ML and DL.
000893827 536__ $$0G:(DE-HGF)POF4-5112$$a5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x0
000893827 536__ $$0G:(DE-HGF)POF4-5111$$a5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)$$cPOF4-511$$fPOF IV$$x1
000893827 536__ $$0G:(EU-Grant)754304$$aDEEP-EST - DEEP - Extreme Scale Technologies (754304)$$c754304$$fH2020-FETHPC-2016$$x2
000893827 536__ $$0G:(EU-Grant)951733$$aAISee - AI- and Simulation-Based Engineering at Exascale (951733)$$c951733$$fH2020-INFRAEDI-2019-1$$x3
000893827 536__ $$0G:(EU-Grant)955606$$aDEEP-SEA - DEEP – SOFTWARE FOR EXASCALE ARCHITECTURES (955606)$$c955606$$fH2020-JTI-EuroHPC-2019-1$$x4
000893827 536__ $$0G:(EU-Grant)951732$$aEUROCC - National Competence Centres in the framework of EuroHPC (951732)$$c951732$$fH2020-JTI-EuroHPC-2019-2$$x5
000893827 588__ $$aDataset connected to CrossRef Conference
000893827 7001_ $$0P:(DE-Juel1)178695$$aSedona, Rocco$$b1$$ufzj
000893827 7001_ $$0P:(DE-Juel1)178934$$aBarakat, Chadi$$b2$$ufzj
000893827 7001_ $$0P:(DE-HGF)0$$aEinarsson, Petur$$b3
000893827 7001_ $$0P:(DE-HGF)0$$aHassanian, Reza$$b4
000893827 7001_ $$0P:(DE-Juel1)171343$$aCavallaro, Gabriele$$b5$$ufzj
000893827 7001_ $$0P:(DE-HGF)0$$aBook, Matthias$$b6
000893827 7001_ $$0P:(DE-HGF)0$$aNeukirchen, Helmut$$b7
000893827 7001_ $$0P:(DE-Juel1)165948$$aLintermann, Andreas$$b8$$ufzj
000893827 773__ $$a10.1109/IPDPSW52791.2021.00019
000893827 8564_ $$uhttps://juser.fz-juelich.de/record/893827/files/IPDPS_HCW_Camera_Ready-Final.pdf$$yOpenAccess
000893827 909CO $$ooai:juser.fz-juelich.de:893827$$pdnbdelivery$$pec_fundedresources$$pVDB$$pdriver$$popen_access$$popenaire
000893827 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)132239$$aForschungszentrum Jülich$$b0$$kFZJ
000893827 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)178695$$aForschungszentrum Jülich$$b1$$kFZJ
000893827 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)178934$$aForschungszentrum Jülich$$b2$$kFZJ
000893827 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)171343$$aForschungszentrum Jülich$$b5$$kFZJ
000893827 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)165948$$aForschungszentrum Jülich$$b8$$kFZJ
000893827 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5112$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x0
000893827 9131_ $$0G:(DE-HGF)POF4-511$$1G:(DE-HGF)POF4-510$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5111$$aDE-HGF$$bKey Technologies$$lEngineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action$$vEnabling Computational- & Data-Intensive Science and Engineering$$x1
000893827 9141_ $$y2021
000893827 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
000893827 9201_ $$0I:(DE-Juel1)JSC-20090406$$kJSC$$lJülich Supercomputing Center$$x0
000893827 980__ $$acontrib
000893827 980__ $$aVDB
000893827 980__ $$aI:(DE-Juel1)JSC-20090406
000893827 980__ $$aUNRESTRICTED
000893827 9801_ $$aFullTexts