A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar
Since its introduction in 2011 the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies—as well as in somatic and germline mutation studies. The VCF format can represent single nucleotide variants, multi-nucleotide variants, inse...
Main authors: | Garrison, Erik, Kronenberg, Zev N., Dawson, Eric T., Pedersen, Brent S., Prins, Pjotr |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science, 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9286226/ https://www.ncbi.nlm.nih.gov/pubmed/35639788 http://dx.doi.org/10.1371/journal.pcbi.1009123 |
_version_ | 1784747960798543872 |
---|---|
author | Garrison, Erik Kronenberg, Zev N. Dawson, Eric T. Pedersen, Brent S. Prins, Pjotr |
author_facet | Garrison, Erik Kronenberg, Zev N. Dawson, Eric T. Pedersen, Brent S. Prins, Pjotr |
author_sort | Garrison, Erik |
collection | PubMed |
description | Since its introduction in 2011, the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies, as well as in somatic and germline mutation studies. The VCF format can represent single nucleotide variants, multi-nucleotide variants, insertions and deletions, and simple structural variants called and anchored against a reference genome. Here we present a spectrum of over 125 useful, complementary free and open source software tools and libraries that we wrote and made available through the vcflib, bio-vcf, cyvcf2, hts-nim and slivar projects. These tools are applied for comparison, filtering, normalisation, smoothing and annotation of VCF, as well as output of statistics, visualisation, and transformations of variant files. These tools run every day in critical biomedical pipelines and countless shell scripts. Our tools are part of the wider bioinformatics ecosystem and we highlight best practices. We briefly discuss the design of VCF, lessons learnt, and how we can address more complex variation through pangenome graph formats, variation that cannot easily be represented by the VCF format. |
format | Online Article Text |
id | pubmed-9286226 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-9286226 2022-07-16 A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar Garrison, Erik Kronenberg, Zev N. Dawson, Eric T. Pedersen, Brent S. Prins, Pjotr PLoS Comput Biol Research Article Since its introduction in 2011, the variant call format (VCF) has been widely adopted for processing DNA and RNA variants in practically all population studies, as well as in somatic and germline mutation studies. The VCF format can represent single nucleotide variants, multi-nucleotide variants, insertions and deletions, and simple structural variants called and anchored against a reference genome. Here we present a spectrum of over 125 useful, complementary free and open source software tools and libraries that we wrote and made available through the vcflib, bio-vcf, cyvcf2, hts-nim and slivar projects. These tools are applied for comparison, filtering, normalisation, smoothing and annotation of VCF, as well as output of statistics, visualisation, and transformations of variant files. These tools run every day in critical biomedical pipelines and countless shell scripts. Our tools are part of the wider bioinformatics ecosystem and we highlight best practices. We briefly discuss the design of VCF, lessons learnt, and how we can address more complex variation through pangenome graph formats, variation that cannot easily be represented by the VCF format. Public Library of Science 2022-05-31 /pmc/articles/PMC9286226/ /pubmed/35639788 http://dx.doi.org/10.1371/journal.pcbi.1009123 Text en © 2022 Garrison et al https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
title | A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9286226/ https://www.ncbi.nlm.nih.gov/pubmed/35639788 http://dx.doi.org/10.1371/journal.pcbi.1009123 |