| On the use of containers for machine learning and visualization workflows on JUWELS |
| Abstract | FZJ-2021-00743 |
2020
Abstract: Containers bundle a code and its dependencies into a single package so that it runs reliably and efficiently in different computing environments. They promise the same level of isolation and security as a virtual machine, with a higher degree of integration with the host operating system (OS). From a user perspective, the main benefits of containers are greater software flexibility, reliability, ease of deployment, and portability. Containers have become very popular on cloud systems, but they have not been used much in HPC environments. In this study, we tested the use of containers and measured the performance of containerized workflows for two separate applications on HPC systems. In the first use case, we automated the visualization of global wildfire activity and the resulting “smoke” plumes from numerical model results of the Copernicus Atmosphere Monitoring Service (https://www.ecmwf.int/en/about/what-we-do/environmental-services/copernicus-atmosphere-monitoring-service). The motivation for this workflow was to expedite the visualization of new fire situations without having to engage several people along the workflow, from data extraction and data transformation to the actual visualization. Once a container workflow is defined for this application, it can easily be adapted to other model variables, time periods, etc. We therefore built a Singularity container that includes the pre-processing step of the visualization process for an arbitrary dataset. Preliminary results on the JUWELS system at the Jülich Supercomputing Centre (JSC) show satisfactory scaling of the application across multiple nodes. Work has begun to automate the full visualization process, including the ParaView application. For the second use case, we have partially containerized a machine learning workflow in the context of weather and climate applications.
In this proof of concept, we adopt a deep learning architecture for video frame prediction to forecast surface temperature fields over Europe for up to 20 hours based on ERA5 reanalysis data. Since this workflow requires extensive data processing and the evaluation of various deep learning architectures, we have developed a containerized workflow for the full lifecycle of the application, which can run in parallel on several nodes. This containerized application uses Docker and Sarus and comprises data extraction, data pre-processing, training, post-processing, and visualization. Preliminary results of the containerized application on up to 8 nodes of the Piz Daint HPC system at the Swiss National Supercomputing Centre show a satisfactory level of scalability. In the next phase of this study, we will adapt the application to Singularity and run it on the JUWELS system at JSC.