Distributed data management for large collaborative projects: DataLad ecosystem in Collaborative Research Center 1451
Poster (After Call) | FZJ-2023-05235
2023
Please use a persistent id in citations: doi:10.5281/ZENODO.8355962 doi:10.34734/FZJ-2023-05235
Abstract: Multi-site research projects offer a unique opportunity for scientific insight based on data collected across different modalities, paradigms, and species. Yet they also pose unique research data management (RDM) challenges. Here, we present software developments and lessons learned from the information management project of CRC1451. Given the large variability of RDM demands across more than 20 CRC member projects, we opted for a decentralized approach: projects retain full control over key data management decisions (standards, storage, sharing), while the findability, accessibility, interoperability, and reusability of their data are achieved with DataLad as an overlay structure for all distributed datasets. We use DataLad Catalog to generate an online data portal based on metadata. Metadata extraction is done using MetaLad, following the iterative 'capture immediately, curate perpetually' approach. To mitigate DataLad’s limited adoption outside central projects, we are developing two solutions. First, DataLad Gooey is a graphical user interface for basic data management operations. Second, DataLad Tabby is a format specification and a collection of tools for dataset descriptions that can be created and provided as a spreadsheet, using well-defined terms, and translated to catalog records and linked data objects.