A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar

Since its introduction in 2011 the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies—as well as in somatic and germline mutation studies. The VCF format can represent single nucleotide variants, multi-nucleotide variants, inse...

Bibliographic Details
Main authors: Garrison, Erik; Kronenberg, Zev N.; Dawson, Eric T.; Pedersen, Brent S.; Prins, Pjotr
Format: Online Article Text
Language: English
Published: Public Library of Science, 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9286226/
https://www.ncbi.nlm.nih.gov/pubmed/35639788
http://dx.doi.org/10.1371/journal.pcbi.1009123
author Garrison, Erik
Kronenberg, Zev N.
Dawson, Eric T.
Pedersen, Brent S.
Prins, Pjotr
collection PubMed
description Since its introduction in 2011, the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies, as well as in somatic and germline mutation studies. VCF can represent single nucleotide variants, multi-nucleotide variants, insertions and deletions, and simple structural variants called and anchored against a reference genome. Here we present a spectrum of over 125 useful, complementary free and open source software tools and libraries that we wrote and made available through the vcflib, bio-vcf, cyvcf2, hts-nim and slivar projects. These tools are applied for comparison, filtering, normalisation, smoothing and annotation of VCF, as well as for output of statistics, visualisation, and transformations of variant files. They run every day in critical biomedical pipelines and countless shell scripts. Our tools are part of the wider bioinformatics ecosystem and we highlight best practices. We briefly discuss the design of VCF, lessons learnt, and how pangenome graph formats can address more complex variation that cannot easily be represented in VCF.
format Online
Article
Text
id pubmed-9286226
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9286226 2022-07-16
A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar
Garrison, Erik; Kronenberg, Zev N.; Dawson, Eric T.; Pedersen, Brent S.; Prins, Pjotr
PLoS Comput Biol
Research Article
Public Library of Science 2022-05-31
/pmc/articles/PMC9286226/
/pubmed/35639788
http://dx.doi.org/10.1371/journal.pcbi.1009123
Text en
© 2022 Garrison et al
https://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar
topic Research Article