001 | 1017950 | ||
005 | 20240403082756.0 | ||
024 | 7 | _ | |a 10.1109/IGARSS52108.2023.10283416 |2 doi |
024 | 7 | _ | |a 10.34734/FZJ-2023-04455 |2 datacite_doi |
024 | 7 | _ | |a WOS:001098971601004 |2 WOS |
037 | _ | _ | |a FZJ-2023-04455 |
100 | 1 | _ | |a Tian, Liang |0 P:(DE-HGF)0 |b 0 |
111 | 2 | _ | |a IEEE International Geoscience and Remote Sensing Symposium (IGARSS) |c Pasadena |d 2023-07-16 - 2023-07-21 |w CA |
245 | _ | _ | |a End-to-End Process Orchestration of Earth Observation Data Workflows with Apache Airflow on High Performance Computing |
260 | _ | _ | |c 2023 |b IEEE |
295 | 1 | 0 | |a IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium : [Proceedings] - IEEE, 2023. - ISBN 979-8-3503-2010-7 - doi:10.1109/IGARSS52108.2023.10283416 |
300 | _ | _ | |a 711-714 |
336 | 7 | _ | |a CONFERENCE_PAPER |2 ORCID |
336 | 7 | _ | |a Conference Paper |0 33 |2 EndNote |
336 | 7 | _ | |a INPROCEEDINGS |2 BibTeX |
336 | 7 | _ | |a conferenceObject |2 DRIVER |
336 | 7 | _ | |a Output Types/Conference Paper |2 DataCite |
336 | 7 | _ | |a Contribution to a conference proceedings |b contrib |m contrib |0 PUB:(DE-HGF)8 |s 1702456531_7741 |2 PUB:(DE-HGF) |
336 | 7 | _ | |a Contribution to a book |0 PUB:(DE-HGF)7 |2 PUB:(DE-HGF) |m contb |
520 | _ | _ | |a Earth Observation (EO) data processing faces challenges due to large volumes, multiple sources, and diverse formats. To address this issue, this paper presents a scalable and parallelizable workflow using Apache Airflow, capable of integrating Machine Learning (ML) and Deep Learning (DL) models with Modular Supercomputing Architecture (MSA) systems. To test the workflow, we considered the production of large-scale Land-Cover (LC) maps as a case study. The workflow manager, Airflow, offers scalability, extensibility, and programmable task definition in Python. It allows us to execute different steps of the workflow in different High-Performance Computing (HPC) systems. The workflow is demonstrated on the Dynamical Exascale Entry Platform (DEEP) and Jülich Research on Exascale Cluster Architectures (JURECA) hosted at the Jülich Supercomputing Centre (JSC), a platform that incorporates heterogeneous JSC systems. |
536 | _ | _ | |a 5111 - Domain-Specific Simulation & Data Life Cycle Labs (SDLs) and Research Groups (POF4-511) |0 G:(DE-HGF)POF4-5111 |c POF4-511 |f POF IV |x 0 |
536 | _ | _ | |a 5112 - Cross-Domain Algorithms, Tools, Methods Labs (ATMLs) and Research Groups (POF4-511) |0 G:(DE-HGF)POF4-5112 |c POF4-511 |f POF IV |x 1 |
536 | _ | _ | |a RAISE - Research on AI- and Simulation-Based Engineering at Exascale (951733) |0 G:(EU-Grant)951733 |c 951733 |f H2020-INFRAEDI-2019-1 |x 2 |
536 | _ | _ | |a EUROCC-2 (DEA02266) |0 G:(DE-Juel-1)DEA02266 |c DEA02266 |x 3 |
588 | _ | _ | |a Dataset connected to CrossRef Conference |
700 | 1 | _ | |a Sedona, Rocco |0 P:(DE-Juel1)178695 |b 1 |u fzj |
700 | 1 | _ | |a Mozaffari, Amirpasha |0 P:(DE-Juel1)166264 |b 2 |u fzj |
700 | 1 | _ | |a Kreshpa, Enxhi |0 P:(DE-Juel1)188445 |b 3 |u fzj |
700 | 1 | _ | |a Paris, Claudia |0 P:(DE-HGF)0 |b 4 |
700 | 1 | _ | |a Riedel, Morris |0 P:(DE-Juel1)132239 |b 5 |u fzj |
700 | 1 | _ | |a Schultz, Martin G. |0 P:(DE-Juel1)6952 |b 6 |u fzj |
700 | 1 | _ | |a Cavallaro, Gabriele |0 P:(DE-Juel1)171343 |b 7 |u fzj |
773 | _ | _ | |a 10.1109/IGARSS52108.2023.10283416 |
856 | 4 | _ | |u https://juser.fz-juelich.de/record/1017950/files/Liang_Tian_IGARSS_2023.pdf |y OpenAccess |
909 | C | O | |o oai:juser.fz-juelich.de:1017950 |p openaire |p open_access |p driver |p VDB |p ec_fundedresources |p dnbdelivery |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 1 |6 P:(DE-Juel1)178695 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 2 |6 P:(DE-Juel1)166264 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 3 |6 P:(DE-Juel1)188445 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 5 |6 P:(DE-Juel1)132239 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 6 |6 P:(DE-Juel1)6952 |
910 | 1 | _ | |a Forschungszentrum Jülich |0 I:(DE-588b)5008462-8 |k FZJ |b 7 |6 P:(DE-Juel1)171343 |
913 | 1 | _ | |a DE-HGF |b Key Technologies |l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action |1 G:(DE-HGF)POF4-510 |0 G:(DE-HGF)POF4-511 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Enabling Computational- & Data-Intensive Science and Engineering |9 G:(DE-HGF)POF4-5111 |x 0 |
913 | 1 | _ | |a DE-HGF |b Key Technologies |l Engineering Digital Futures – Supercomputing, Data Management and Information Security for Knowledge and Action |1 G:(DE-HGF)POF4-510 |0 G:(DE-HGF)POF4-511 |3 G:(DE-HGF)POF4 |2 G:(DE-HGF)POF4-500 |4 G:(DE-HGF)POF |v Enabling Computational- & Data-Intensive Science and Engineering |9 G:(DE-HGF)POF4-5112 |x 1 |
914 | 1 | _ | |y 2023 |
915 | _ | _ | |a OpenAccess |0 StatID:(DE-HGF)0510 |2 StatID |
920 | 1 | _ | |0 I:(DE-Juel1)JSC-20090406 |k JSC |l Jülich Supercomputing Center |x 0 |
980 | _ | _ | |a contrib |
980 | _ | _ | |a VDB |
980 | _ | _ | |a UNRESTRICTED |
980 | _ | _ | |a contb |
980 | _ | _ | |a I:(DE-Juel1)JSC-20090406 |
980 | 1 | _ | |a FullTexts |