Using Provenance for FAIR Sharing of Results of Workflows Analyzing Neural Activity Data
Conference Presentation (Invited) | FZJ-2025-03264
2025
Abstract: Computational workflows for the analysis of electrophysiology data are often implemented as scripts that read input datasets and produce result files [1]. As the available data grow in size due to technological advances, collaboration between researchers becomes increasingly important for handling the complexity of the analysis. Moreover, the derived analysis outputs are more valuable for reuse because modern analysis methods often require competitive high-performance compute resources. To facilitate these structural changes in the way scientists engage in data analysis, we consider the FAIRness [2] of results of computational analysis workflows that are derived from experimental data. To do so, we investigate descriptions of the analysis process that go beyond source code and free-text descriptions and that enable query, introspection, and seamless reuse of the results. We identify multiple challenges throughout the workflow that complicate the generation of such descriptions: the iterative nature of analysis scenarios, a lack of technical and semantic standardization, and the availability of competing software implementations [3].

To address these challenges, we developed Alpaca (Automatic Lightweight Provenance Capture), a framework that generates machine-readable descriptions of the workflow execution, enriched with the relevant semantic information, with minimal user intervention [4]. This produces a detailed provenance record of the atomic analysis steps using the W3C PROV standard [5]. The record is enriched using the Neuroelectrophysiology Analysis Ontology (NEAO), which we developed as a unified vocabulary to standardize the descriptions of the methods involved in the analysis of extracellular electrophysiology data [6]. We show how to gain insight into the results (e.g., using knowledge graphs) of real-world workflows for analyzing and comparing heterogeneous data based on Elephant [7] and Cobrawap [8].
We discuss extensions to other computational workflows (e.g., neural simulation) to help in representing their results according to the FAIR principles.

[1] Denker et al., 2021. Neuroforum 27, 27
[2] Wilkinson et al., 2016. Sci Data 3, 160018
[3] Unakafova and Gail, 2019. Front Neuroinform 13, 57
[4] Köhler et al., 2024. eNeuro 11, ENEURO.0476-23.2024
[5] https://www.w3.org/TR/prov-overview
[6] Köhler et al., 2024. arXiv:2412.05021
[7] Denker et al., 2018. Neuroinformatics 2018, doi:10.12751/incf.ni2018.0019
[8] Gutzen et al., 2024. Cell Rep Meth 4, 100681
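
To make the idea of a step-level provenance record more concrete, the following is a minimal sketch in plain Python. It is not Alpaca's API or implementation: the capture decorator, the entity and lineage helpers, the ex: labels, and the two toy analysis functions are all hypothetical names chosen for illustration. Only the terms prov:Entity, prov:Activity, prov:used, and prov:wasGeneratedBy are taken from the W3C PROV vocabulary [5]. The sketch assumes a linear chain of steps in which every function returns a new object.

```python
import functools
import itertools

# Term names from the W3C PROV vocabulary; everything else is hypothetical.
RDF_TYPE = "rdf:type"
PROV_USED = "prov:used"
PROV_GENERATED_BY = "prov:wasGeneratedBy"

triples = set()                  # (subject, predicate, object) statements
_labels = {}                     # id(obj) -> label of the data object
_keep_alive = []                 # holds objects so that id() values stay unique
_counter = itertools.count(1)


def entity(obj):
    """Return the label of a data object, registering it as a prov:Entity."""
    key = id(obj)
    if key not in _labels:
        _labels[key] = f"ex:data_{next(_counter)}"
        _keep_alive.append(obj)
        triples.add((_labels[key], RDF_TYPE, "prov:Entity"))
    return _labels[key]


def capture(func):
    """Record each call of func as a prov:Activity with inputs and output."""
    @functools.wraps(func)
    def wrapper(*args):
        result = func(*args)
        activity = f"ex:{func.__name__}_{next(_counter)}"
        triples.add((activity, RDF_TYPE, "prov:Activity"))
        for arg in args:
            triples.add((activity, PROV_USED, entity(arg)))
        triples.add((entity(result), PROV_GENERATED_BY, activity))
        return result
    return wrapper


@capture
def subtract_mean(signal):
    mean = sum(signal) / len(signal)
    return [x - mean for x in signal]


@capture
def rectify(signal):
    return [abs(x) for x in signal]


def lineage(label):
    """Walk back from a data label and list the activities, nearest first.

    Follows only the first input of each activity (linear chains only).
    """
    steps, seen = [], set()
    while label not in seen:
        seen.add(label)
        acts = [o for s, p, o in triples if s == label and p == PROV_GENERATED_BY]
        if not acts:
            break
        steps.append(acts[0])
        used = [o for s, p, o in triples if s == acts[0] and p == PROV_USED]
        if not used:
            break
        label = used[0]
    return steps


raw = [1.0, 2.0, 3.0]
result = rectify(subtract_mean(raw))
steps = lineage(entity(result))   # activities that led to the result
```

Here the analysis functions themselves are unchanged apart from the decorator, and the question "which steps produced this result?" is answered from the recorded statements rather than by reading the script. A real record would additionally carry parameters, file information, and ontology annotations such as those provided by NEAO [6].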