%0 Conference Paper
%A Szczepanik, Michał
%A Heunis, Stephan
%A Mönch, Christian
%A Wagner, Adina
%A Waite, Alexander Q.
%A Waite, Laura
%A Hanke, Michael
%T Distributed data management for large collaborative projects: DataLad ecosystem in Collaborative Research Center 1451
%M FZJ-2023-05235
%D 2023
%X Multi-site research projects offer a unique opportunity for scientific insight based on data collected across different modalities, paradigms, and species. Yet they also pose unique research data management (RDM) challenges. Here, we present software developments and lessons learned from the information management project of Collaborative Research Center (CRC) 1451. Given the large variability of RDM demands across more than 20 CRC member projects, we opted for a decentralized approach: projects retain full control over key data management decisions (standards, storage, sharing), and the findability, accessibility, interoperability, and reusability (FAIR) of their data are achieved with DataLad as an overlay structure for all distributed datasets. We use DataLad Catalog to generate an online data portal based on metadata. Metadata extraction is done using MetaLad, following the iterative 'capture immediately, curate perpetually' approach. To mitigate DataLad’s limited adoption outside central projects, we are developing two solutions. First, DataLad Gooey is a graphical user interface for basic data management operations. Second, DataLad Tabby is a format specification and a collection of tools for dataset descriptions that can be created and provided as spreadsheets using well-defined terms, and that are translatable to catalog records and linked data objects.
%B INCF Neuroinformatics Assembly 2023
%C 18 Sep 2023 - 20 Sep 2023, online (Sweden)
%F PUB:(DE-HGF)24
%9 Poster
%R 10.5281/ZENODO.8355962
%U https://juser.fz-juelich.de/record/1019189