Full-Length Envelope Analyzer (FLEA): A tool for longitudinal analysis of viral amplicons

Next generation sequencing of viral populations has advanced our understanding of viral population dynamics, the development of drug resistance, and escape from host immune responses. Many applications require complete gene sequences, which can be impossible to reconstruct from short reads. HIV env,...

Bibliographic Details
Main authors: Eren, Kemal; Weaver, Steven; Ketteringham, Robert; Valentyn, Morné; Laird Smith, Melissa; Kumar, Venkatesh; Mohan, Sanjay; Kosakovsky Pond, Sergei L.; Murrell, Ben
Format: Online Article Text
Language: English
Published: Public Library of Science 2018
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6314628/
https://www.ncbi.nlm.nih.gov/pubmed/30543621
http://dx.doi.org/10.1371/journal.pcbi.1006498
_version_ 1783384133722963968
author Eren, Kemal
Weaver, Steven
Ketteringham, Robert
Valentyn, Morné
Laird Smith, Melissa
Kumar, Venkatesh
Mohan, Sanjay
Kosakovsky Pond, Sergei L.
Murrell, Ben
collection PubMed
description Next generation sequencing of viral populations has advanced our understanding of viral population dynamics, the development of drug resistance, and escape from host immune responses. Many applications require complete gene sequences, which can be impossible to reconstruct from short reads. HIV env, the protein of interest for HIV vaccine studies, is exceptionally challenging for long-read sequencing and analysis due to its length, high substitution rate, and extensive indel variation. While long-read sequencing is attractive in this setting, the analysis of such data is not well handled by existing methods. To address this, we introduce FLEA (Full-Length Envelope Analyzer), which performs end-to-end analysis and visualization of long-read sequencing data. FLEA consists of both a pipeline (optionally run on a high-performance cluster), and a client-side web application that provides interactive results. The pipeline transforms FASTQ reads into high-quality consensus sequences (HQCSs) and uses them to build a codon-aware multiple sequence alignment. The resulting alignment is then used to infer phylogenies, selection pressure, and evolutionary dynamics. The web application provides publication-quality plots and interactive visualizations, including an annotated viral alignment browser, time series plots of evolutionary dynamics, visualizations of gene-wide selective pressures (such as dN/dS) across time and across protein structure, and a phylogenetic tree browser. We demonstrate how FLEA may be used to process Pacific Biosciences HIV env data and describe recent examples of its use. Simulations show how FLEA dramatically reduces the error rate of this sequencing platform, providing an accurate portrait of complex and variable HIV env populations. A public instance of FLEA is hosted at http://flea.datamonkey.org. The Python source code for the FLEA pipeline can be found at https://github.com/veg/flea-pipeline. 
The client-side application is available at https://github.com/veg/flea-web-app. A live demo of the P018 results can be found at http://flea.murrell.group/view/P018.
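The abstract's claim that consensus sequences reduce the sequencing error rate can be illustrated with a toy simulation. The sketch below is not FLEA's implementation. It assumes the reads are already aligned and of equal length, models substitution errors only (no indels), and uses a plain per-column majority vote. The function names `add_errors`, `majority_consensus`, and `mismatch_rate`, the template length, the 10% error rate, and the read count are all hypothetical values chosen for illustration.

```python
import random
from collections import Counter

BASES = "ACGT"


def add_errors(template, error_rate, rng):
    """Return a copy of template with independent substitution errors."""
    out = []
    for base in template:
        if rng.random() < error_rate:
            # Substitute with one of the three other bases.
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def majority_consensus(reads):
    """Per-column majority vote over equal-length, pre-aligned reads."""
    if not reads:
        raise ValueError("need at least one read")
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must have equal length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*reads))


def mismatch_rate(a, b):
    """Fraction of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


# Assumed toy parameters: arbitrary template length, error rate, read count.
rng = random.Random(0)
template = "".join(rng.choice(BASES) for _ in range(2000))
reads = [add_errors(template, 0.10, rng) for _ in range(25)]

consensus = majority_consensus(reads)
read_error = sum(mismatch_rate(r, template) for r in reads) / len(reads)
consensus_error = mismatch_rate(consensus, template)

print(f"mean per-read error rate: {read_error:.4f}")
print(f"consensus error rate:     {consensus_error:.4f}")
```

Because each column is decided by many independent reads, a wrong base has to outnumber the correct one to reach the consensus, which becomes unlikely as the number of reads grows. Real long-read data also contain insertions and deletions, so an actual pipeline has to align the reads before any such vote.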
format Online
Article
Text
id pubmed-6314628
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-6314628 2019-01-11
PLoS Comput Biol
Research Article
Public Library of Science 2018-12-13
/pmc/articles/PMC6314628/
/pubmed/30543621
http://dx.doi.org/10.1371/journal.pcbi.1006498
Text en
© 2018 Eren et al http://creativecommons.org/licenses/by/4.0/
This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Full-Length Envelope Analyzer (FLEA): A tool for longitudinal analysis of viral amplicons
topic Research Article