TY  - CONF
AU  - Hoffstaedter, Felix
TI  - Reproducibility vs. computational efficiency on HPC systems
M1  - FZJ-2024-03087
PY  - 2024
AB  - HPC systems have particular hardware and software configurations that introduce specific challenges for implementing reproducible data processing workflows. The DataLad-based 'FAIRly big workflow' separates the compute environment from the processing pipeline, enabling automatic reproducibility across systems. Yet the sheer amount of RAM and number of CPUs on HPC systems allow for different ways to optimize compute jobs than on compute clusters, and certainly than on the average workstation or laptop. In this talk, I discuss general differences between HPC and more standard compute environments regarding the choices necessary to set up reproducible processing pipelines. Among the main factors are the availability of RAM, local storage, inodes, and wall clock time.
T2  - Distribits: technologies for distributed data management
CY  - 4 Apr 2024 - 4 Apr 2024, Düsseldorf (Germany)
Y2  - 4 Apr 2024 - 4 Apr 2024
M2  - Düsseldorf, Germany
LB  - PUB:(DE-HGF)6
UR  - https://juser.fz-juelich.de/record/1025704
ER  -