Capturing detailed provenance information in the analysis of electrophysiology data

Köhler, Cristiano; Denker, Michael; Ulianych, Danylo; Grün, Sonja; Gerkin, Richard C.; Davison, Andrew P.

Marc 21

001			892689
005			20240313095022.0
037	_	_	\|a FZJ-2021-02267
100	1	_	\|a Köhler, Cristiano \|0 P:(DE-Juel1)180365 \|b 0 \|e Corresponding author
111	2	_	\|a 14th Göttingen Meeting of the German Neuroscience Society 2021 \|g NWG2021 \|c online \|d 2021-03-22 - 2021-03-30 \|w Germany
245	_	_	\|a Capturing detailed provenance information in the analysis of electrophysiology data
260	_	_	\|c 2021
336	7	_	\|a Conference Paper \|0 33 \|2 EndNote
336	7	_	\|a INPROCEEDINGS \|2 BibTeX
336	7	_	\|a conferenceObject \|2 DRIVER
336	7	_	\|a CONFERENCE_POSTER \|2 ORCID
336	7	_	\|a Output Types/Conference Poster \|2 DataCite
336	7	_	\|a Poster \|b poster \|m poster \|0 PUB:(DE-HGF)24 \|s 1628513565_30837 \|2 PUB:(DE-HGF) \|x Other
520	_	_	\|a The analysis of electrophysiology data typically comprises multiple steps. These often consist of several scripts executed in a specific sequence that take different parameter sets and use distinct data files. As the researcher adjusts the individual analysis steps to accommodate new hypotheses or additional data, the resulting workflows may become increasingly complex and undergo frequent changes. Therefore, robust tools for building the workflows are necessary to fully document them and improve the reproducibility of the results. Provenance refers to the characterization of data manipulations and corresponding parameters throughout the analysis [1]. It is possible to use workflow management systems to orchestrate the execution of the scripts and capture provenance information at the level of the script (i.e., which script file was executed, and in which environment?) and data file (i.e., which input and output files were supplied to that script?). However, the resulting provenance track does not automatically provide details about the actual analysis carried out inside each script. Thus, analysis results can only be understood by inspecting the source code or by trusting the correctness of any accompanying documentation. Here, we aim to improve existing tools by implementing a data model that captures detailed provenance information and by accurately representing the analysis results in a systematic and formalized manner. We focus on two open-source tools for the analysis of electrophysiology data. The Neo (RRID:SCR_000634) framework provides an object model to standardize neural activity data acquired from distinct sources [2].
Elephant (RRID:SCR_003833) is a Python toolbox that provides several functions for the analysis of electrophysiology data [3]. We implemented prototypes of two complementary solutions to extend the functionality of Neo and Elephant to (i) automatically capture provenance information at the function-execution level inside a Python script, and to (ii) support the standardization of the analysis results together with the storage of relevant information describing their generation. The first solution is a set of data analysis objects that standardize the output of Elephant functions. They encapsulate all relevant parameters used by the function to generate the output, such that they can be easily re-used or shared. The second solution maps function inputs, outputs, and parameters throughout the execution of the Python analysis script, and builds a representation of the relationships between the different steps of the analysis within the script (i.e., the provenance trace). The captured information can be used to build a graph that visualizes the steps followed in the script and that can be stored together with the results as metadata. We compare the results obtained with and without the use of the two solutions on the basis of a realistic analysis scenario of electrophysiology data, showing the potential benefits for reproducibility, interoperability, discoverability, and re-use of analysis results. References: [1] Ragan et al. (2016) IEEE Trans Visual Comput Graphics 22:31. [2] Garcia et al. (2014) Front Neuroinform 8:10. [3] http://python-elephant.org.
536	_	_	\|a 5235 - Digitization of Neuroscience and User-Community Building (POF4-523) \|0 G:(DE-HGF)POF4-5235 \|c POF4-523 \|f POF IV \|x 0
536	_	_	\|a 5231 - Neuroscientific Foundations (POF4-523) \|0 G:(DE-HGF)POF4-5231 \|c POF4-523 \|f POF IV \|x 1
536	_	_	\|a 571 - Connectivity and Activity (POF3-571) \|0 G:(DE-HGF)POF3-571 \|c POF3-571 \|f POF III \|x 2
536	_	_	\|a 574 - Theory, modelling and simulation (POF3-574) \|0 G:(DE-HGF)POF3-574 \|c POF3-574 \|f POF III \|x 3
536	_	_	\|a HDS LEE - Helmholtz School for Data Science in Life, Earth and Energy (HDS LEE) (HDS-LEE-20190612) \|0 G:(DE-Juel1)HDS-LEE-20190612 \|c HDS-LEE-20190612 \|x 4
536	_	_	\|a HBP SGA2 - Human Brain Project Specific Grant Agreement 2 (785907) \|0 G:(EU-Grant)785907 \|c 785907 \|f H2020-SGA-FETFLAG-HBP-2017 \|x 5
536	_	_	\|a HBP SGA3 - Human Brain Project Specific Grant Agreement 3 (945539) \|0 G:(EU-Grant)945539 \|c 945539 \|x 6
536	_	_	\|a HAF - Helmholtz Analytics Framework (ZT-I-0003) \|0 G:(DE-HGF)ZT-I-0003 \|c ZT-I-0003 \|x 7
700	1	_	\|a Ulianych, Danylo \|0 P:(DE-Juel1)178793 \|b 1
700	1	_	\|a Gerkin, Richard C. \|0 P:(DE-HGF)0 \|b 2
700	1	_	\|a Davison, Andrew P. \|0 P:(DE-HGF)0 \|b 3
700	1	_	\|a Grün, Sonja \|0 P:(DE-Juel1)144168 \|b 4
700	1	_	\|a Denker, Michael \|0 P:(DE-Juel1)144807 \|b 5
909	C	O	\|o oai:juser.fz-juelich.de:892689 \|p openaire \|p VDB \|p ec_fundedresources
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 0 \|6 P:(DE-Juel1)180365
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 1 \|6 P:(DE-Juel1)178793
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 4 \|6 P:(DE-Juel1)144168
910	1	_	\|a Forschungszentrum Jülich \|0 I:(DE-588b)5008462-8 \|k FZJ \|b 5 \|6 P:(DE-Juel1)144807
913	0	_	\|a DE-HGF \|b Key Technologies \|l Decoding the Human Brain \|1 G:(DE-HGF)POF3-570 \|0 G:(DE-HGF)POF3-571 \|3 G:(DE-HGF)POF3 \|2 G:(DE-HGF)POF3-500 \|4 G:(DE-HGF)POF \|v Connectivity and Activity \|x 0
913	0	_	\|a DE-HGF \|b Key Technologies \|l Decoding the Human Brain \|1 G:(DE-HGF)POF3-570 \|0 G:(DE-HGF)POF3-574 \|3 G:(DE-HGF)POF3 \|2 G:(DE-HGF)POF3-500 \|4 G:(DE-HGF)POF \|v Theory, modelling and simulation \|x 1
913	1	_	\|a DE-HGF \|b Key Technologies \|l Natural, Artificial and Cognitive Information Processing \|1 G:(DE-HGF)POF4-520 \|0 G:(DE-HGF)POF4-523 \|3 G:(DE-HGF)POF4 \|2 G:(DE-HGF)POF4-500 \|4 G:(DE-HGF)POF \|v Neuromorphic Computing and Network Dynamics \|9 G:(DE-HGF)POF4-5235 \|x 0
913	1	_	\|a DE-HGF \|b Key Technologies \|l Natural, Artificial and Cognitive Information Processing \|1 G:(DE-HGF)POF4-520 \|0 G:(DE-HGF)POF4-523 \|3 G:(DE-HGF)POF4 \|2 G:(DE-HGF)POF4-500 \|4 G:(DE-HGF)POF \|v Neuromorphic Computing and Network Dynamics \|9 G:(DE-HGF)POF4-5231 \|x 1
914	1	_	\|y 2021
920	_	_	\|l yes
920	1	_	\|0 I:(DE-Juel1)INM-6-20090406 \|k INM-6 \|l Computational and Systems Neuroscience \|x 0
920	1	_	\|0 I:(DE-Juel1)INM-10-20170113 \|k INM-10 \|l Jara-Institut Brain structure-function relationships \|x 1
920	1	_	\|0 I:(DE-Juel1)IAS-6-20130828 \|k IAS-6 \|l Theoretical Neuroscience \|x 2
980	_	_	\|a poster
980	_	_	\|a VDB
980	_	_	\|a I:(DE-Juel1)INM-6-20090406
980	_	_	\|a I:(DE-Juel1)INM-10-20170113
980	_	_	\|a I:(DE-Juel1)IAS-6-20130828
980	_	_	\|a UNRESTRICTED
981	_	_	\|a I:(DE-Juel1)IAS-6-20130828
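
Illustrative sketch (not part of the record): the abstract describes capturing provenance at the function-execution level by mapping function inputs, outputs, and parameters and then relating the analysis steps to one another. The Python fragment below is one possible way to picture that idea, using only the standard library. It is not the authors' implementation and does not use the Neo or Elephant APIs; the names `TRACE`, `track`, `edges`, `bin_spikes`, and `mean_rate` are hypothetical and chosen only for this example.

```python
import functools
import inspect

# Hypothetical in-memory provenance trace (an assumption for this sketch,
# not the data model described in the abstract).
TRACE = []


def track(func):
    """Record the inputs, parameters, and output of every call to `func`."""
    signature = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Resolve positional/keyword arguments and fill in default values,
        # so that parameters left implicit by the caller are also recorded.
        bound = signature.bind(*args, **kwargs)
        bound.apply_defaults()
        result = func(*args, **kwargs)
        TRACE.append({
            "function": func.__name__,
            "inputs": dict(bound.arguments),
            "output": result,
        })
        return result

    return wrapper


def edges(trace):
    """Link step i to a later step j when the output object of i is an input of j."""
    links = []
    for i, producer in enumerate(trace):
        for consumer in trace[i + 1:]:
            for name, value in consumer["inputs"].items():
                # Identity check: the very same object was passed along.
                if value is producer["output"]:
                    links.append((producer["function"], consumer["function"], name))
    return links


@track
def bin_spikes(spike_times, bin_size=1.0, t_stop=4.0):
    """Toy analysis step: count spikes per time bin."""
    n_bins = int(t_stop / bin_size)
    counts = [0] * n_bins
    for t in spike_times:
        index = int(t / bin_size)
        if index < n_bins:
            counts[index] += 1
    return counts


@track
def mean_rate(counts, bin_size=1.0):
    """Toy analysis step: average number of spikes per unit time."""
    return sum(counts) / (len(counts) * bin_size)


counts = bin_spikes([0.1, 0.4, 1.2, 3.7])
rate = mean_rate(counts)
links = edges(TRACE)  # one edge: bin_spikes -> mean_rate via "counts"
```

In this sketch the trace keeps references to the actual objects, so steps can be linked by object identity; a persistent provenance record would more likely store hashes or serialized descriptions instead, which is a design decision the abstract does not specify.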
