Semantically-enriched description of electrophysiology data analysis workflows

Köhler, Cristiano; Denker, Michael; Grün, Sonja
% IMPORTANT: The following is UTF-8 encoded.  This means that in the presence
% of non-ASCII characters, it will not work with BibTeX 0.99 or older.
% Instead, you should use an up-to-date BibTeX implementation like “bibtex8” or
% “biber”.

@INPROCEEDINGS{Khler:1027723,
      author       = {Köhler, Cristiano and Grün, Sonja and Denker, Michael},
      title        = {{S}emantically-enriched description of electrophysiology
                      data analysis workflows},
      reportid     = {FZJ-2024-04031},
      year         = {2024},
      abstract     = {Extracellular electrophysiology is a common experimental
                      technique to investigate brain function. Also, the outcome
                      of brain simulations on the neural network level can be
                      related to electrophysiological measures obtained from such
                      experiments. The analysis of electrophysiology data requires
                      specific methods of varying complexity, which are frequently
                      implemented in computational workflows that take the form of
                      a series of scripts that read input datasets and produce
                      result files [1]. Describing the processes throughout the
                      workflow is challenging: (i) parameters are often
                      selected iteratively, and the subsequent results depend on
                      the details of the iterations; (ii) finding result files in
                      a collection is difficult, as several parameters may be
                      stored in a non-machine-readable format inside files or
                      using non-standardized names; (iii) there are several
                      variations of analysis methods that can be applied to the
                      data with similar purposes (e.g., different algorithms to
                      compute the power spectral density from local field
                      potentials); (iv) an analysis method can be implemented by
                      different software codes (e.g., toolboxes such as Elephant
                      [2] or MNE [3]; see also [4]) that adopt different names for
                      the functions and their parameters. In the end, this
                      produces a scenario where it is difficult to find,
                      understand, and compare analysis results, especially in
                      collaborative environments where large result sets are
                      available in shared repositories. To overcome those
                      challenges, we developed a framework to generate
                      machine-readable descriptions of the workflow execution that
                      are enriched with the relevant semantic information. The
                      details of the inputs, outputs, and parameters of the
                      functions called within the workflow scripts are captured
                      with minimal user intervention using the Alpaca (Automatic
                      Lightweight Provenance Capture; RRID:SCR_023739) [5]
                      toolbox. This produces a detailed record of the atomic steps
                      used to generate an analysis result. The provenance
                      information is enriched with annotations using the
                      Neuroelectrophysiology Analysis Ontology (NEAO), which we
                      developed as a unified vocabulary to standardize the
                      descriptions of the methods involved in the analysis of
                      extracellular electrophysiology data. We show real-world
                      examples where the framework was used to generate
                      machine-actionable descriptions of different analyses of an
                      electrophysiology dataset and highlight how it is possible
                      to query information, making it easier to find results and
                      gain insights into them (e.g., using knowledge graphs). In
                      the end, this approach improves the analysis workflow by
                      making the details of the results known and standardizing
                      their description. Ultimately, we discuss how this
                      methodology can be applied more generally to computational
                      workflows (e.g., in network simulation or AI applications)
                      to help represent the results according to the FAIR
                      principles [6]. REFERENCES [1] M. Denker and S. Grün, LNCS,
                      10087, 58, 2016. [2] M. Denker et al., Neuroinformatics
                      2018, P19, 2018. [3] A. Gramfort et al., Front. Neurosci.,
                      7, 267, 2013. [4] V. A. Unakafova and A. Gail, Front.
                      Neuroinform., 13, 57, 2019. [5] C. A. Köhler et al., arXiv,
                      2311.09672, 2023. [6] M. D. Wilkinson et al., Sci. Data, 3,
                      160018, 2016.},
      month        = {Jun},
      date         = {2024-06-03},
      organization = {International Conference on
                      Neuromorphic Computing and Engineering,
                      Aachen (Germany), 3 Jun 2024 - 6 Jun
                      2024},
      subtyp       = {After Call},
      cin          = {IAS-6 / INM-10},
      cid          = {I:(DE-Juel1)IAS-6-20130828 / I:(DE-Juel1)INM-10-20170113},
      pnm          = {5235 - Digitization of Neuroscience and User-Community
                      Building (POF4-523) / 5231 - Neuroscientific Foundations
                      (POF4-523) / HDS LEE - Helmholtz School for Data Science in
                      Life, Earth and Energy (HDS LEE) (HDS-LEE-20190612) / HBP
                      SGA2 - Human Brain Project Specific Grant Agreement 2
                      (785907) / HBP SGA3 - Human Brain Project Specific Grant
                      Agreement 3 (945539) / EBRAINS 2.0 - EBRAINS 2.0: A Research
                      Infrastructure to Advance Neuroscience and Brain Health
                      (101147319) / Algorithms of Adaptive Behavior and their
                      Neuronal Implementation in Health and Disease
                      (iBehave-20220812)},
      pid          = {G:(DE-HGF)POF4-5235 / G:(DE-HGF)POF4-5231 /
                      G:(DE-Juel1)HDS-LEE-20190612 / G:(EU-Grant)785907 /
                      G:(EU-Grant)945539 / G:(EU-Grant)101147319 /
                      G:(DE-Juel-1)iBehave-20220812},
      typ          = {PUB:(DE-HGF)24},
      url          = {https://juser.fz-juelich.de/record/1027723},
}