001030590 001__ 1030590
001030590 005__ 20241014074411.0
001030590 0247_ $$2doi$$a10.48550/arXiv.2408.17309
001030590 0247_ $$2datacite_doi$$a10.34734/FZJ-2024-05343
001030590 037__ $$aFZJ-2024-05343
001030590 041__ $$aEnglish
001030590 088__ $$2arXiv$$a2408.17309
001030590 1001_ $$0P:(DE-Juel1)191583$$aVillamar, Jose$$b0$$eCorresponding author$$ufzj
001030590 245__ $$aMetadata practices for simulation workflows
001030590 260__ $$barXiv$$c2024
001030590 3367_ $$0PUB:(DE-HGF)25$$2PUB:(DE-HGF)$$aPreprint$$bpreprint$$mpreprint$$s1728368733_8447
001030590 3367_ $$2ORCID$$aWORKING_PAPER
001030590 3367_ $$028$$2EndNote$$aElectronic Article
001030590 3367_ $$2DRIVER$$apreprint
001030590 3367_ $$2BibTeX$$aARTICLE
001030590 3367_ $$2DataCite$$aOutput Types/Working Paper
001030590 520__ $$aComputer simulations are an essential pillar of knowledge generation in science. Understanding, reproducing, and exploring the results of simulations relies on tracking and organizing metadata describing numerical experiments. However, the models used to understand real-world systems, and the computational machinery required to simulate them, are typically complex, and produce large amounts of heterogeneous metadata. Here, we present general practices for acquiring and handling metadata that are agnostic to software and hardware, and highly flexible for the user. These consist of two steps: 1) recording and storing raw metadata, and 2) selecting and structuring metadata. As a proof of concept, we develop the Archivist, a Python tool to help with the second step, and use it to apply our practices to distinct high-performance computing use cases from neuroscience and hydrology. Our practices and the Archivist can readily be applied to existing workflows without the need for substantial restructuring. They support sustainable numerical workflows, facilitating reproducibility and data reuse in generic simulation-based research.
001030590 536__ $$0G:(DE-HGF)POF4-5232$$a5232 - Computational Principles (POF4-523)$$cPOF4-523$$fPOF IV$$x0
001030590 536__ $$0G:(DE-HGF)POF4-1121$$a1121 - Digitalization and Systems Technology for Flexibility Solutions (POF4-112)$$cPOF4-112$$fPOF IV$$x1
001030590 536__ $$0G:(DE-Juel-1)ZT-I-PF-3-026$$aMetaMoSim - Generic metadata management for reproducible high-performance-computing simulation workflows - MetaMoSim (ZT-I-PF-3-026)$$cZT-I-PF-3-026$$x2
001030590 536__ $$0G:(EU-Grant)101147319$$aEBRAINS 2.0 - EBRAINS 2.0: A Research Infrastructure to Advance Neuroscience and Brain Health (101147319)$$c101147319$$fHORIZON-INFRA-2022-SERV-B-01$$x3
001030590 536__ $$0G:(DE-Juel-1)HiRSE_PS-20220812$$aHelmholtz Platform for Research Software Engineering - Preparatory Study (HiRSE_PS-20220812)$$cHiRSE_PS-20220812$$x4
001030590 536__ $$0G:(DE-HGF)SO-092$$aACA - Advanced Computing Architectures (SO-092)$$cSO-092$$x5
001030590 536__ $$0G:(DE-Juel1)JL SMHB-2021-2027$$aJL SMHB - Joint Lab Supercomputing and Modeling for the Human Brain (JL SMHB-2021-2027)$$cJL SMHB-2021-2027$$x6
001030590 536__ $$0G:(DE-Juel1)jinb33_20220812$$aBrain-Scale Simulations (jinb33_20220812)$$cjinb33_20220812$$fBrain-Scale Simulations$$x7
001030590 536__ $$0G:(EU-Grant)800858$$aICEI - Interactive Computing E-Infrastructure for the Human Brain Project (800858)$$c800858$$fH2020-SGA-INFRA-FETFLAG-HBP$$x8
001030590 588__ $$aDataset connected to DataCite
001030590 650_7 $$2Other$$aInformation Retrieval (cs.IR)
001030590 650_7 $$2Other$$aFOS: Computer and information sciences
001030590 7001_ $$0P:(DE-HGF)0$$aKelbling, Matthias$$b1
001030590 7001_ $$0P:(DE-Juel1)190225$$aMore, Heather$$b2$$ufzj
001030590 7001_ $$0P:(DE-Juel1)144807$$aDenker, Michael$$b3$$ufzj
001030590 7001_ $$0P:(DE-Juel1)145211$$aTetzlaff, Tom$$b4$$ufzj
001030590 7001_ $$0P:(DE-Juel1)162130$$aSenk, Johanna$$b5$$ufzj
001030590 7001_ $$0P:(DE-HGF)0$$aThober, Stephan$$b6
001030590 773__ $$a10.48550/arXiv.2408.17309
001030590 8564_ $$uhttps://juser.fz-juelich.de/record/1030590/files/Manuscript.pdf$$yOpenAccess
001030590 8564_ $$uhttps://juser.fz-juelich.de/record/1030590/files/Manuscript.gif?subformat=icon$$xicon$$yOpenAccess
001030590 8564_ $$uhttps://juser.fz-juelich.de/record/1030590/files/Manuscript.jpg?subformat=icon-1440$$xicon-1440$$yOpenAccess
001030590 8564_ $$uhttps://juser.fz-juelich.de/record/1030590/files/Manuscript.jpg?subformat=icon-180$$xicon-180$$yOpenAccess
001030590 8564_ $$uhttps://juser.fz-juelich.de/record/1030590/files/Manuscript.jpg?subformat=icon-640$$xicon-640$$yOpenAccess
001030590 909CO $$ooai:juser.fz-juelich.de:1030590$$pdnbdelivery$$pec_fundedresources$$pVDB$$pdriver$$popen_access$$popenaire
001030590 915__ $$0StatID:(DE-HGF)0510$$2StatID$$aOpenAccess
001030590 9141_ $$y2024
001030590 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)191583$$aForschungszentrum Jülich$$b0$$kFZJ
001030590 9101_ $$0I:(DE-588b)36225-6$$6P:(DE-Juel1)191583$$aRWTH Aachen$$b0$$kRWTH
001030590 9101_ $$0I:(DE-HGF)0$$6P:(DE-HGF)0$$aDepartment of Computational Hydrosystems, Helmholtz-Centre for Environmental Research, Leipzig, Germany$$b1
001030590 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)190225$$aForschungszentrum Jülich$$b2$$kFZJ
001030590 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)144807$$aForschungszentrum Jülich$$b3$$kFZJ
001030590 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)145211$$aForschungszentrum Jülich$$b4$$kFZJ
001030590 9101_ $$0I:(DE-588b)5008462-8$$6P:(DE-Juel1)162130$$aForschungszentrum Jülich$$b5$$kFZJ
001030590 9101_ $$0I:(DE-HGF)0$$6P:(DE-HGF)0$$aDepartment of Computational Hydrosystems, Helmholtz-Centre for Environmental Research, Leipzig, Germany$$b6
001030590 9131_ $$0G:(DE-HGF)POF4-523$$1G:(DE-HGF)POF4-520$$2G:(DE-HGF)POF4-500$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-5232$$aDE-HGF$$bKey Technologies$$lNatural, Artificial and Cognitive Information Processing$$vNeuromorphic Computing and Network Dynamics$$x0
001030590 9131_ $$0G:(DE-HGF)POF4-112$$1G:(DE-HGF)POF4-110$$2G:(DE-HGF)POF4-100$$3G:(DE-HGF)POF4$$4G:(DE-HGF)POF$$9G:(DE-HGF)POF4-1121$$aDE-HGF$$bForschungsbereich Energie$$lEnergiesystemdesign (ESD)$$vDigitalisierung und Systemtechnik$$x1
001030590 920__ $$lyes
001030590 9201_ $$0I:(DE-Juel1)IAS-6-20130828$$kIAS-6$$lComputational and Systems Neuroscience$$x0
001030590 9201_ $$0I:(DE-Juel1)IAS-9-20201008$$kIAS-9$$lMaterials Data Science and Informatics$$x1
001030590 980__ $$apreprint
001030590 980__ $$aVDB
001030590 980__ $$aUNRESTRICTED
001030590 980__ $$aI:(DE-Juel1)IAS-6-20130828
001030590 980__ $$aI:(DE-Juel1)IAS-9-20201008
001030590 9801_ $$aFullTexts