TY - JOUR
AU - König, Patrick
AU - Beier, Sebastian
AU - Mascher, Martin
AU - Stein, Nils
AU - Lange, Matthias
AU - Scholz, Uwe
TI - DivBrowse—interactive visualization and exploratory data analysis of variant call matrices
JO - GigaScience
VL - 12
SN - 2047-217X
CY - Oxford
PB - Oxford University Press
M1 - FZJ-2023-01932
SP - giad025
PY - 2023
AB - Background: The sequencing of whole genomes is becoming increasingly affordable. In this context, large-scale sequencing projects are generating ever larger datasets of species-specific genomic diversity. As a consequence, more and more genomic data need to be made easily accessible and analyzable to the scientific community. Findings: We present DivBrowse, a web application for interactive visualization and exploratory analysis of genomic diversity data stored in Variant Call Format (VCF) files of any size. By seamlessly combining BLAST as an entry point together with interactive data analysis features such as principal component analysis in one graphical user interface, DivBrowse provides a novel and unique set of exploratory data analysis capabilities for genomic biodiversity datasets. The capability to integrate DivBrowse into existing web applications supports interoperability between different web applications. Built-in interactive computation of principal component analysis allows users to perform ad hoc analysis of the population structure based on specific genetic elements such as genes and exons. Data interoperability is supported by the ability to export genomic diversity data in VCF and General Feature Format 3 files. Conclusion: DivBrowse offers a novel approach for interactive visualization and analysis of genomic diversity data and optionally also gene annotation data by including features like interactive calculation of variant frequencies and principal component analysis. The use of established standard file formats for data input supports interoperability and seamless deployment of application instances based on the data output of established bioinformatics pipelines.
LB - PUB:(DE-HGF)16
C6 - 37083938
UR - <Go to ISI:>//WOS:001023598000001
DO - 10.1093/gigascience/giad025
UR - https://juser.fz-juelich.de/record/1006993
ER -