001     890154
005     20230127125338.0
037 _ _ |a FZJ-2021-00743
041 _ _ |a English
100 1 _ |a Gong, Bing
|0 P:(DE-Juel1)177767
|b 0
|e Corresponding author
111 2 _ |a NIC Symposium 2020
|c Jülich
|d 2020-02-27 - 2020-02-28
|w Germany
245 _ _ |a On the use of containers for machine learning and visualization workflows on JUWELS
260 _ _ |c 2020
336 7 _ |a Abstract
|b abstract
|m abstract
|0 PUB:(DE-HGF)1
|s 1611580634_2888
|2 PUB:(DE-HGF)
336 7 _ |a Conference Paper
|0 33
|2 EndNote
336 7 _ |a INPROCEEDINGS
|2 BibTeX
336 7 _ |a conferenceObject
|2 DRIVER
336 7 _ |a Output Types/Conference Abstract
|2 DataCite
336 7 _ |a OTHER
|2 ORCID
520 _ _ |a Containers bundle an application together with its dependencies into a single package so that it can run reliably and efficiently across different computing environments. They promise the same level of isolation and security as a virtual machine while offering a higher degree of integration with the host operating system (OS). From a user perspective, the main benefits of containers are greater software flexibility, reliability, ease of deployment, and portability. Containers have become very popular on cloud systems, but they have not yet seen much use in HPC environments. In this study, we have tested the use of containers and measured the performance of the containerized workflow for two separate applications on an HPC system. In the first use case, we have automated the visualization of global wildfire activity and the resulting “smoke” plumes from numerical model results of the Copernicus Atmosphere Monitoring Service (https://www.ecmwf.int/en/about/what-we-do/environmental-services/copernicus-atmosphere-monitoring-service). The motivation for this workflow was to expedite the visualization of new fire situations without having to engage several people along the workflow, from data extraction through data transformation to the actual visualization. Once a container workflow is defined for this application, it can easily be adapted to work with other model variables, time periods, etc. We therefore built a container using Singularity that covers the pre-processing steps of the visualization pipeline for an arbitrary dataset. Preliminary results on the JUWELS system at the Jülich Supercomputing Centre (JSC) have shown satisfactory scaling of the application across multiple nodes. Work has begun to automate the full visualization process, including the ParaView application. For the second use case, we have partially containerized a machine learning workflow in the context of weather and climate applications.
In this proof of concept, we are adopting a deep learning architecture for video frame prediction to forecast surface temperature fields over Europe for up to 20 hours based on ERA5 reanalysis data. Since this workflow requires immense data processing and the evaluation of various deep learning architectures, we have developed a containerized workflow for the full lifecycle of the application, which can run in parallel on several nodes. This containerized application uses Docker and Sarus and comprises data extraction, data pre-processing, training, post-processing, and visualization. Preliminary results of the containerized application on up to 8 nodes of the Piz Daint HPC system at the Swiss National Supercomputing Centre show a satisfactory level of scalability. In the next phase of this study, we will port the application to Singularity and run it on the JUWELS system at JSC.
536 _ _ |a 512 - Data-Intensive Science and Federated Computing (POF3-512)
|0 G:(DE-HGF)POF3-512
|c POF3-512
|f POF III
|x 0
536 _ _ |a IntelliAQ - Artificial Intelligence for Air Quality (787576)
|0 G:(EU-Grant)787576
|c 787576
|f ERC-2017-ADG
|x 1
536 _ _ |0 G:(DE-Juel-1)ESDE
|a Earth System Data Exploration (ESDE)
|c ESDE
|x 2
700 1 _ |a Vogelsang, Jan
|0 P:(DE-Juel1)173676
|b 1
700 1 _ |a Mozaffari, Amirpasha
|0 P:(DE-Juel1)166264
|b 2
700 1 _ |a Schultz, Martin
|0 P:(DE-Juel1)6952
|b 3
909 C O |o oai:juser.fz-juelich.de:890154
|p openaire
|p VDB
|p ec_fundedresources
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 0
|6 P:(DE-Juel1)177767
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 1
|6 P:(DE-Juel1)173676
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 2
|6 P:(DE-Juel1)166264
910 1 _ |a Forschungszentrum Jülich
|0 I:(DE-588b)5008462-8
|k FZJ
|b 3
|6 P:(DE-Juel1)6952
913 1 _ |a DE-HGF
|b Key Technologies
|l Supercomputing & Big Data
|1 G:(DE-HGF)POF3-510
|0 G:(DE-HGF)POF3-512
|3 G:(DE-HGF)POF3
|2 G:(DE-HGF)POF3-500
|4 G:(DE-HGF)POF
|v Data-Intensive Science and Federated Computing
|x 0
914 1 _ |y 2020
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)JSC-20090406
|k JSC
|l Jülich Supercomputing Center
|x 0
980 _ _ |a abstract
980 _ _ |a VDB
980 _ _ |a I:(DE-Juel1)JSC-20090406
980 _ _ |a UNRESTRICTED