Book/Report FZJ-2024-02201

Integrating HPC, AI, and Workflows for Scientific Data Analysis (Dagstuhl Seminar 23352)


2024
Schloss Dagstuhl – Leibniz-Zentrum für Informatik

Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Reports 13, 36 pages [10.4230/DAGREP.13.8.129]


Please use a persistent id in citations: doi:10.4230/DAGREP.13.8.129

Report No.: 13/8

Abstract: Dagstuhl Seminar 23352, 'Integrating HPC, AI, and Workflows for Scientific Data Analysis,' was held from August 27 to September 1, 2023, and focused on the synergy between High-Performance Computing (HPC), Artificial Intelligence (AI), and scientific workflow technologies. The seminar recognized that modern Big Data analysis in science rests on three pillars: workflow technologies for reproducibility and steering, AI and Machine Learning (ML) for versatile analysis, and HPC for handling large data sets. Although all three are crucial, they have traditionally been researched separately, which has left gaps in their integration. The seminar aimed to bridge these gaps and to examine the challenges and opportunities at the intersection of these technologies. It highlighted the complex interplay between HPC, workflows, and ML, noting that ML is increasingly integrated into scientific workflows, which raises resource demands and places new requirements on HPC architectures, such as support for GPUs and iterative computations. The seminar also addressed the challenges of adapting HPC for large-scale ML tasks, including deep learning, and the need for workflow systems to evolve so that they can fully leverage ML in data analysis. In addition, the seminar explored how ML could optimize scientific workflow systems and HPC operations, for example through improved scheduling and fault tolerance. A key focus was on identifying prestigious use cases of ML in HPC and understanding their unique, unmet requirements. The stochastic nature of ML and its impact on the reproducibility of data analysis on HPC systems were also discussed.

Keyword(s): Large scale data presentation and analysis ; Exascale class machine optimization ; Performance data analysis and root cause detection ; High dimensional data representation ; Computing methodologies → Distributed computing methodologies ; Computing methodologies → Machine learning ; Computing methodologies → Parallel computing methodologies


Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)

Appears in the scientific report 2024
Database coverage:
Creative Commons Attribution CC BY 4.0 ; OpenAccess

The record appears in these collections:
Document types > Reports > Reports
Document types > Books > Books
Workflow collections > Public records
Institute Collections > JSC
Publications database
Open Access

 Record created 2024-04-03, last modified 2024-04-11

