Conference Presentation (Other) FZJ-2024-03087

Reproducibility vs. computational efficiency on HPC systems



2024

Distribits: technologies for distributed data management, Düsseldorf, Germany, 4 Apr 2024

Abstract: HPC systems have particular hardware and software configurations that introduce specific challenges for implementing reproducible data processing workflows. The DataLad-based 'FAIRly big workflow' separates the compute environment from the processing pipeline, enabling automatic reproducibility across systems. Yet the sheer amount of RAM and number of CPUs on HPC systems allow for different ways to optimize compute jobs than on compute clusters, and certainly than on the average workstation or laptop. In this talk, I discuss general differences between HPC and more standard compute environments regarding the choices necessary to set up processing pipelines that are reproducible. Among the main factors are the availability of RAM, local storage, inodes and wall clock time.


Contributing Institute(s):
  1. Gehirn & Verhalten (INM-7)
Research Program(s):
  1. 5254 - Neuroscientific Data Analytics and AI (POF4-525)

Appears in the scientific report 2024

The record appears in these collections:
Document types > Presentations > Conference talks
Institute collections > INM > INM-7
Workflow collections > Public entries
Publications database

 Record created 2024-04-23, last modified 2024-05-06


