Validation of Variant Assembly Using HAPHPIPE with Next-Generation Sequence Data from Viruses

Next-generation sequencing (NGS) offers a powerful opportunity to identify low-abundance, intra-host viral sequence variants, yet the focus of many bioinformatic tools on consensus sequence construction has precluded a thorough analysis of intra-host diversity. To take full advantage of the resoluti...

Bibliographic Details
Main authors: Gibson, Keylie M., Steiner, Margaret C., Rentia, Uzma, Bendall, Matthew L., Pérez-Losada, Marcos, Crandall, Keith A.
Format: Online Article Text
Language: English
Published: MDPI 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7412389/
https://www.ncbi.nlm.nih.gov/pubmed/32674515
http://dx.doi.org/10.3390/v12070758
_version_ 1783568596766556160
author Gibson, Keylie M.
Steiner, Margaret C.
Rentia, Uzma
Bendall, Matthew L.
Pérez-Losada, Marcos
Crandall, Keith A.
author_sort Gibson, Keylie M.
collection PubMed
description Next-generation sequencing (NGS) offers a powerful opportunity to identify low-abundance, intra-host viral sequence variants, yet the focus of many bioinformatic tools on consensus sequence construction has precluded a thorough analysis of intra-host diversity. To take full advantage of the resolution of NGS data, we developed HAplotype PHylodynamics PIPEline (HAPHPIPE), an open-source tool for the de novo and reference-based assembly of viral NGS data, with both consensus sequence assembly and a focus on the quantification of intra-host variation through haplotype reconstruction. We validate and compare the consensus sequence assembly methods of HAPHPIPE to those of two alternative software packages, HyDRA and Geneious, using simulated HIV and empirical HIV, HCV, and SARS-CoV-2 datasets. Our validation methods included read mapping, genetic distance, and genetic diversity metrics. In simulated NGS data, HAPHPIPE generated pol consensus sequences significantly closer to the true consensus sequence than those produced by HyDRA and Geneious and performed comparably to Geneious for HIV gp120 sequences. Furthermore, using empirical data from multiple viruses, we demonstrate that HAPHPIPE can analyze larger sequence datasets due to its greater computational speed. Therefore, we contend that HAPHPIPE provides a more user-friendly platform for users with and without bioinformatics experience to implement current best practices for viral NGS assembly than other currently available options.
format Online
Article
Text
id pubmed-7412389
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-7412389 2020-08-26 Validation of Variant Assembly Using HAPHPIPE with Next-Generation Sequence Data from Viruses Gibson, Keylie M. Steiner, Margaret C. Rentia, Uzma Bendall, Matthew L. Pérez-Losada, Marcos Crandall, Keith A. Viruses Article MDPI 2020-07-14 /pmc/articles/PMC7412389/ /pubmed/32674515 http://dx.doi.org/10.3390/v12070758 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title Validation of Variant Assembly Using HAPHPIPE with Next-Generation Sequence Data from Viruses
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7412389/
https://www.ncbi.nlm.nih.gov/pubmed/32674515
http://dx.doi.org/10.3390/v12070758