V-Phaser 2: variant inference for viral populations
BACKGROUND: Massively parallel sequencing offers the possibility of revolutionizing the study of viral populations by providing ultra deep sequencing (tens to hundreds of thousand fold coverage) of complete viral genomes. However, differentiation of true low frequency variants from sequencing errors...
Main authors: | Yang, Xiao; Charlebois, Patrick; Macalalad, Alex; Henn, Matthew R; Zody, Michael C |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3907024/ https://www.ncbi.nlm.nih.gov/pubmed/24088188 http://dx.doi.org/10.1186/1471-2164-14-674 |
_version_ | 1782301556144603136 |
---|---|
author | Yang, Xiao Charlebois, Patrick Macalalad, Alex Henn, Matthew R Zody, Michael C |
author_facet | Yang, Xiao Charlebois, Patrick Macalalad, Alex Henn, Matthew R Zody, Michael C |
author_sort | Yang, Xiao |
collection | PubMed |
description | BACKGROUND: Massively parallel sequencing offers the possibility of revolutionizing the study of viral populations by providing ultra deep sequencing (tens to hundreds of thousand fold coverage) of complete viral genomes. However, differentiation of true low frequency variants from sequencing errors remains challenging. RESULTS: We developed a software package, V-Phaser 2, for inferring intrahost diversity within viral populations. This program adds three major new methodologies to the state of the art: a technique to efficiently utilize paired end read data for calling phased variants, a new strategy to represent and infer length polymorphisms, and an in line filter for erroneous calls arising from systematic sequencing artifacts. We have also heavily optimized memory and run time performance. This combination of algorithmic and technical advances allows V-Phaser 2 to fully utilize extremely deep paired end sequencing data (such as generated by Illumina sequencers) to accurately infer low frequency intrahost variants in viral populations in reasonable time on a standard desktop computer. V-Phaser 2 was validated and compared to both QuRe and the original V-Phaser on three datasets obtained from two viral populations: a mixture of eight known strains of West Nile Virus (WNV) sequenced on both 454 Titanium and Illumina MiSeq and a mixture of twenty-four known strains of WNV sequenced only on 454 Titanium. V-Phaser 2 outperformed the other two programs in both sensitivity and specificity while using more than five fold less time and memory. CONCLUSIONS: We developed V-Phaser 2, a publicly available software tool (V-Phaser 2 can be accessed via: http://www.broadinstitute.org/scientific-community/science/projects/viral-genomics/v-phaser-2 and is freely available for academic use) that enables the efficient analysis of ultra-deep sequencing data produced by common next generation sequencing platforms for viral populations. |
format | Online Article Text |
id | pubmed-3907024 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3907024 2014-02-12 V-Phaser 2: variant inference for viral populations Yang, Xiao Charlebois, Patrick Macalalad, Alex Henn, Matthew R Zody, Michael C BMC Genomics Methodology Article BACKGROUND: Massively parallel sequencing offers the possibility of revolutionizing the study of viral populations by providing ultra deep sequencing (tens to hundreds of thousand fold coverage) of complete viral genomes. However, differentiation of true low frequency variants from sequencing errors remains challenging. RESULTS: We developed a software package, V-Phaser 2, for inferring intrahost diversity within viral populations. This program adds three major new methodologies to the state of the art: a technique to efficiently utilize paired end read data for calling phased variants, a new strategy to represent and infer length polymorphisms, and an in line filter for erroneous calls arising from systematic sequencing artifacts. We have also heavily optimized memory and run time performance. This combination of algorithmic and technical advances allows V-Phaser 2 to fully utilize extremely deep paired end sequencing data (such as generated by Illumina sequencers) to accurately infer low frequency intrahost variants in viral populations in reasonable time on a standard desktop computer. V-Phaser 2 was validated and compared to both QuRe and the original V-Phaser on three datasets obtained from two viral populations: a mixture of eight known strains of West Nile Virus (WNV) sequenced on both 454 Titanium and Illumina MiSeq and a mixture of twenty-four known strains of WNV sequenced only on 454 Titanium. V-Phaser 2 outperformed the other two programs in both sensitivity and specificity while using more than five fold less time and memory. CONCLUSIONS: We developed V-Phaser 2, a publicly available software tool (V-Phaser 2 can be accessed via: http://www.broadinstitute.org/scientific-community/science/projects/viral-genomics/v-phaser-2 and is freely available for academic use) that enables the efficient analysis of ultra-deep sequencing data produced by common next generation sequencing platforms for viral populations. BioMed Central 2013-10-03 /pmc/articles/PMC3907024/ /pubmed/24088188 http://dx.doi.org/10.1186/1471-2164-14-674 Text en Copyright © 2013 Yang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Yang, Xiao Charlebois, Patrick Macalalad, Alex Henn, Matthew R Zody, Michael C V-Phaser 2: variant inference for viral populations |
title | V-Phaser 2: variant inference for viral populations |
title_full | V-Phaser 2: variant inference for viral populations |
title_fullStr | V-Phaser 2: variant inference for viral populations |
title_full_unstemmed | V-Phaser 2: variant inference for viral populations |
title_short | V-Phaser 2: variant inference for viral populations |
title_sort | v-phaser 2: variant inference for viral populations |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3907024/ https://www.ncbi.nlm.nih.gov/pubmed/24088188 http://dx.doi.org/10.1186/1471-2164-14-674 |
work_keys_str_mv | AT yangxiao vphaser2variantinferenceforviralpopulations AT charleboispatrick vphaser2variantinferenceforviralpopulations AT macalaladalex vphaser2variantinferenceforviralpopulations AT hennmatthewr vphaser2variantinferenceforviralpopulations AT zodymichaelc vphaser2variantinferenceforviralpopulations |