001     908264
005     20230711160605.0
024 7 _ |2 doi
|a 10.12688/f1000research.109080.2
024 7 _ |2 Handle
|a 2128/31527
024 7 _ |2 altmetric
|a altmetric:128566175
037 _ _ |a FZJ-2022-02498
041 _ _ |a English
082 _ _ |a 610
100 1 _ |0 P:(DE-Juel1)191149
|a Beier, Sebastian
|b 0
245 _ _ |a Recommendations for the formatting of Variant Call Format (VCF) files to make plant genotyping data FAIR
260 _ _ |a London
|b F1000 Research Ltd
|c 2022
336 7 _ |2 DRIVER
|a article
336 7 _ |2 DataCite
|a Output Types/Journal article
336 7 _ |0 PUB:(DE-HGF)16
|2 PUB:(DE-HGF)
|a Journal Article
|b journal
|m journal
|s 1673254625_17708
336 7 _ |2 BibTeX
|a ARTICLE
336 7 _ |2 ORCID
|a JOURNAL_ARTICLE
336 7 _ |0 0
|2 EndNote
|a Journal Article
520 _ _ |a In this opinion article, we discuss the formatting of files from (plant) genotyping studies, in particular the formatting of metadata in Variant Call Format (VCF) files. The flexibility of the VCF format specification facilitates its use as a generic interchange format across domains but can lead to inconsistency between files in the presentation of metadata. To enable fully autonomous machine actionable data flow, generic elements need to be further specified. We strongly support the merits of the FAIR principles and see the need to facilitate them also through technical implementation specifications. They form a basis for the proposed VCF extensions here. We have learned from the existing application of VCF that the definition of relevant metadata using controlled standards, vocabulary and the consistent use of cross-references via resolvable identifiers (machine-readable) are particularly necessary and propose their encoding. VCF is an established standard for the exchange and publication of genotyping data. Other data formats are also used to capture variant data (for example, the HapMap and the gVCF formats), but none currently have the reach of VCF. For the sake of simplicity, we will only discuss VCF and our recommendations for its use, but these recommendations could also be applied to gVCF. However, the part of the VCF standard relating to metadata (as opposed to the actual variant calls) defines a syntactic format but no vocabulary, unique identifier or recommended content. In practice, often only sparse descriptive metadata is included. When descriptive metadata is provided, proprietary metadata fields are frequently added that have not been agreed upon within the community, which may limit long-term and comprehensive interoperability. To address this, we propose recommendations for supplying and encoding metadata, focusing on use cases from plant sciences. We expect there to be overlap, but also divergence, with the needs of other domains.
536 _ _ |0 G:(DE-HGF)POF4-2171
|a 2171 - Biological and environmental resources for sustainable use (POF4-217)
|c POF4-217
|f POF IV
|x 0
536 _ _ |0 G:(EU-Grant)862613
|a AGENT - Activated GEnebank NeTwork (862613)
|c 862613
|f H2020-SFS-2019-2
|x 1
536 _ _ |0 G:(BMBF)031A536C
|a de.NBI - Etablierungsphase - Leistungszentrum - GCBN - German Crop BioGreenformatics Network (031A536C)
|c 031A536C
|x 2
588 _ _ |a Dataset connected to CrossRef, Journals: juser.fz-juelich.de
700 1 _ |0 0000-0003-3159-3593
|a Fiebig, Anne
|b 1
700 1 _ |0 0000-0002-9040-8733
|a Pommier, Cyril
|b 2
700 1 _ |0 0000-0002-4839-5158
|a Liyanage, Isuru
|b 3
700 1 _ |0 0000-0002-4316-078X
|a Lange, Matthias
|b 4
700 1 _ |0 P:(DE-HGF)0
|a Kersey, Paul J.
|b 5
700 1 _ |0 0000-0003-4031-9131
|a Weise, Stephan
|b 6
700 1 _ |0 0000-0002-4368-8058
|a Finkers, Richard
|b 7
700 1 _ |0 0000-0002-1187-8148
|a Koylass, Baron
|b 8
700 1 _ |0 0000-0002-5626-270X
|a Cezard, Timothee
|b 9
700 1 _ |0 0000-0002-9551-6370
|a Courtot, Mélanie
|b 10
700 1 _ |0 0000-0002-5462-907X
|a Contreras-Moreira, Bruno
|b 11
700 1 _ |0 P:(DE-HGF)0
|a Naamati, Guy
|b 12
700 1 _ |0 P:(DE-HGF)0
|a Dyer, Sarah
|b 13
700 1 _ |0 0000-0001-6113-3518
|a Scholz, Uwe
|b 14
|e Corresponding author
773 _ _ |0 PERI:(DE-600)2699932-8
|a 10.12688/f1000research.109080.2
|g Vol. 11, p. 231 -
|p 231 -
|t F1000Research
|v 11
|x 2046-1402
|y 2022
856 4 _ |u https://f1000research.com/articles/11-231/v2
856 4 _ |u https://juser.fz-juelich.de/record/908264/files/d93007ae-53fd-4438-b1a2-4fec2532da0b_109080_-_sebastian_beier.pdf
|y OpenAccess
909 C O |o oai:juser.fz-juelich.de:908264
|p openaire
|p open_access
|p driver
|p VDB
|p ec_fundedresources
|p dnbdelivery
910 1 _ |0 I:(DE-588b)5008462-8
|6 P:(DE-Juel1)191149
|a Forschungszentrum Jülich
|b 0
|k FZJ
910 1 _ |0 I:(DE-HGF)0
|6 P:(DE-Juel1)191149
|a IPK Gatersleben
|b 0
910 1 _ |0 I:(DE-HGF)0
|6 0000-0001-6113-3518
|a IPK Gatersleben
|b 14
913 1 _ |0 G:(DE-HGF)POF4-217
|1 G:(DE-HGF)POF4-210
|2 G:(DE-HGF)POF4-200
|3 G:(DE-HGF)POF4
|4 G:(DE-HGF)POF
|9 G:(DE-HGF)POF4-2171
|a DE-HGF
|b Forschungsbereich Erde und Umwelt
|l Erde im Wandel – Unsere Zukunft nachhaltig gestalten
|v Für eine nachhaltige Bio-Ökonomie – von Ressourcen zu Produkten
|x 0
914 1 _ |y 2022
915 _ _ |0 LIC:(DE-HGF)CCBYNV
|2 V:(DE-HGF)
|a Creative Commons Attribution CC BY (No Version)
|b DOAJ
|d 2020-09-04
915 _ _ |0 StatID:(DE-HGF)0510
|2 StatID
|a OpenAccess
915 _ _ |0 StatID:(DE-HGF)0561
|2 StatID
|a Article Processing Charges
|d 2020-09-04
915 _ _ |0 StatID:(DE-HGF)0700
|2 StatID
|a Fees
|d 2020-09-04
915 _ _ |0 StatID:(DE-HGF)0200
|2 StatID
|a DBCoverage
|b SCOPUS
|d 2022-11-16
915 _ _ |0 StatID:(DE-HGF)0300
|2 StatID
|a DBCoverage
|b Medline
|d 2022-11-16
915 _ _ |0 StatID:(DE-HGF)0501
|2 StatID
|a DBCoverage
|b DOAJ Seal
|d 2020-10-14T09:38:47Z
915 _ _ |0 StatID:(DE-HGF)0500
|2 StatID
|a DBCoverage
|b DOAJ
|d 2020-10-14T09:38:47Z
915 _ _ |0 StatID:(DE-HGF)0030
|2 StatID
|a Peer Review
|b DOAJ : Open peer review
|d 2020-10-14T09:38:47Z
920 _ _ |l yes
920 1 _ |0 I:(DE-Juel1)IBG-4-20200403
|k IBG-4
|l Bioinformatik
|x 0
980 _ _ |a journal
980 _ _ |a VDB
980 _ _ |a I:(DE-Juel1)IBG-4-20200403
980 _ _ |a UNRESTRICTED
980 _ _ |a OPENSCIENCE
980 1 _ |a FullTexts

