Contribution to a conference proceedings/Contribution to a book FZJ-2023-04455

End-to-End Process Orchestration of Earth Observation Data Workflows with Apache Airflow on High Performance Computing


2023
IEEE

IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium : [Proceedings] - IEEE, 2023. - ISBN 979-8-3503-2010-7 - doi:10.1109/IGARSS52108.2023.10283416
IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Pasadena, CA, 16 Jul 2023 - 21 Jul 2023
IEEE, pp. 711-714, doi:10.1109/IGARSS52108.2023.10283416


Please use a persistent id in citations: doi:10.1109/IGARSS52108.2023.10283416

Abstract: Earth Observation (EO) data processing is challenging because of large data volumes, multiple sources, and diverse formats. To address these challenges, this paper presents a scalable and parallelizable workflow built with Apache Airflow that can integrate Machine Learning (ML) and Deep Learning (DL) models with Modular Supercomputing Architecture (MSA) systems. The production of large-scale Land-Cover (LC) maps serves as a case study to test the workflow. The workflow manager, Airflow, offers scalability, extensibility, and programmable task definition in Python, and it allows different steps of the workflow to be executed on different High-Performance Computing (HPC) systems. The workflow is demonstrated on the Dynamical Exascale Entry Platform (DEEP) and the Jülich Research on Exascale Cluster Architectures (JURECA) system, both hosted at the Jülich Supercomputing Centre (JSC), a platform that incorporates heterogeneous JSC systems.


Contributing Institute(s):
  1. Jülich Supercomputing Center (JSC)
Research Program(s):
  1. 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511)
  2. 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511)
  3. RAISE - Research on AI- and Simulation-Based Engineering at Exascale (951733)
  4. EUROCC-2 (DEA02266)

Appears in the scientific report 2023
Database coverage:
OpenAccess

The record appears in these collections:
Document types > Events > Contributions to a conference proceedings
Document types > Books > Contribution to a book
Workflow collections > Public records
Institute Collections > JSC
Publications database
Open Access

 Record created 2023-11-11, last modified 2024-04-03

