Empirical validation of viral quasispecies assembly algorithms: state-of-the-art and challenges

Next generation sequencing (NGS) is superseding Sanger technology for analysing intra-host viral populations, in terms of genome length and resolution. We introduce two new empirical validation data sets and test the available viral population assembly software. Two intra-host viral population ‘quas...

Bibliographic Details
Main authors: Prosperi, Mattia C. F.; Yin, Li; Nolan, David J.; Lowe, Amanda D.; Goodenow, Maureen M.; Salemi, Marco
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2013
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789152/
https://www.ncbi.nlm.nih.gov/pubmed/24089188
http://dx.doi.org/10.1038/srep02837
_version_ 1782286402727182336
author Prosperi, Mattia C. F.
Yin, Li
Nolan, David J.
Lowe, Amanda D.
Goodenow, Maureen M.
Salemi, Marco
collection PubMed
description Next generation sequencing (NGS) is superseding Sanger technology for analysing intra-host viral populations, in terms of genome length and resolution. We introduce two new empirical validation data sets and test the available viral population assembly software. Two intra-host viral population ‘quasispecies’ samples (type-1 human immunodeficiency and hepatitis C virus) were Sanger-sequenced, and plasmid clone mixtures at controlled proportions were shotgun-sequenced using Roche's 454 sequencing platform. The performance of different assemblers was compared in terms of phylogenetic clustering and recombination with the Sanger clones. Phylogenetic clustering showed that all assemblers captured a proportion of the most divergent lineages, but none were able to provide a high precision/recall tradeoff. Estimated variant frequencies mildly correlated with the original. Given the limitations of currently available algorithms identified by our empirical validation, the development and exploitation of additional data sets is needed, in order to establish an efficient framework for viral population reconstruction using NGS.
format Online
Article
Text
id pubmed-3789152
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-3789152 2013-10-18 Empirical validation of viral quasispecies assembly algorithms: state-of-the-art and challenges Prosperi, Mattia C. F. Yin, Li Nolan, David J. Lowe, Amanda D. Goodenow, Maureen M. Salemi, Marco Sci Rep Article Next generation sequencing (NGS) is superseding Sanger technology for analysing intra-host viral populations, in terms of genome length and resolution. We introduce two new empirical validation data sets and test the available viral population assembly software. Two intra-host viral population ‘quasispecies’ samples (type-1 human immunodeficiency and hepatitis C virus) were Sanger-sequenced, and plasmid clone mixtures at controlled proportions were shotgun-sequenced using Roche's 454 sequencing platform. The performance of different assemblers was compared in terms of phylogenetic clustering and recombination with the Sanger clones. Phylogenetic clustering showed that all assemblers captured a proportion of the most divergent lineages, but none were able to provide a high precision/recall tradeoff. Estimated variant frequencies mildly correlated with the original. Given the limitations of currently available algorithms identified by our empirical validation, the development and exploitation of additional data sets is needed, in order to establish an efficient framework for viral population reconstruction using NGS. Nature Publishing Group 2013-10-03 /pmc/articles/PMC3789152/ /pubmed/24089188 http://dx.doi.org/10.1038/srep02837 Text en Copyright © 2013, Macmillan Publishers Limited. All rights reserved http://creativecommons.org/licenses/by/3.0/ This work is licensed under a Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/
title Empirical validation of viral quasispecies assembly algorithms: state-of-the-art and challenges
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3789152/
https://www.ncbi.nlm.nih.gov/pubmed/24089188
http://dx.doi.org/10.1038/srep02837