
Viral deep sequencing needs an adaptive approach: IRMA, the iterative refinement meta-assembler


Bibliographic Details
Main authors: Shepard, Samuel S.; Meno, Sarah; Bahl, Justin; Wilson, Malania M.; Barnes, John; Neuhaus, Elizabeth
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Methodology Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5011931/
https://www.ncbi.nlm.nih.gov/pubmed/27595578
http://dx.doi.org/10.1186/s12864-016-3030-6
collection PubMed
description BACKGROUND: Deep sequencing makes it possible to observe low-frequency viral variants and sub-populations with greater accuracy and sensitivity than ever before. Existing platforms can be used to multiplex a large number of samples; however, analysis of the resulting data is complex and involves separating barcoded samples and various read manipulation processes ending in final assembly. Many assembly tools were designed with larger genomes and higher fidelity polymerases in mind and do not perform well with reads derived from highly variable viral genomes. Reference-based assemblers may leave gaps in viral assemblies while de novo assemblers may struggle to assemble unique genomes.

RESULTS: The IRMA (iterative refinement meta-assembler) pipeline solves the problem of viral variation by the iterative optimization of read gathering and assembly. As with all reference-based assembly, reads are included in assembly when they match consensus template sets; however, IRMA provides for on-the-fly reference editing, correction, and optional elongation without the need for additional reference selection. This increases both read depth and breadth. IRMA also focuses on quality control, error correction, indel reporting, variant calling and variant phasing. In fact, IRMA’s ability to detect and phase minor variants is one of its most distinguishing features. We have built modules for influenza and ebolavirus. We demonstrate usage and provide calibration data from mixture experiments. Methods for variant calling, phasing, and error estimation/correction have been redesigned to meet the needs of viral genomic sequencing.

CONCLUSION: IRMA provides a robust next-generation sequencing assembly solution that is adapted to the needs and characteristics of viral genomes. The software solves issues related to the genetic diversity of viruses while providing customized variant calling, phasing, and quality control. IRMA is freely available for non-commercial use on Linux and Mac OS X and has been parallelized for high-throughput computing.

ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-016-3030-6) contains supplementary material, which is available to authorized users.
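The idea of iterative read gathering and reference editing described in the abstract can be illustrated with a toy sketch. This is not IRMA's implementation: the function names (`best_ungapped_hit`, `refine`), the ungapped placement, the identity threshold, and the plain majority-vote consensus are all assumptions chosen for brevity. The sketch only shows why editing the reference between rounds can recruit reads that a divergent starting reference would reject.

```python
from collections import Counter


def best_ungapped_hit(read, ref):
    """Return (offset, identity) of the best ungapped placement of read on ref.

    Ties are resolved in favor of the leftmost offset.
    """
    best_offset, best_identity = 0, 0.0
    for offset in range(len(ref) - len(read) + 1):
        window = ref[offset:offset + len(read)]
        identity = sum(a == b for a, b in zip(read, window)) / len(read)
        if identity > best_identity:
            best_offset, best_identity = offset, identity
    return best_offset, best_identity


def refine(ref, reads, min_identity=0.8, max_rounds=5):
    """Toy iterative refinement (hypothetical, not IRMA's algorithm).

    Each round: gather reads that match the current reference well enough,
    take a per-column majority vote, and use the result as the next reference.
    Returns (final reference, number of reads gathered in the last round).
    """
    used = 0
    for _ in range(max_rounds):
        columns = [Counter() for _ in ref]
        used = 0
        for read in reads:
            if len(read) > len(ref):
                continue  # this toy model cannot elongate the reference
            offset, identity = best_ungapped_hit(read, ref)
            if identity >= min_identity:
                used += 1
                for i, base in enumerate(read):
                    columns[offset + i][base] += 1
        # Columns without coverage keep the base from the current reference.
        new_ref = "".join(
            col.most_common(1)[0][0] if col else old
            for col, old in zip(columns, ref)
        )
        if new_ref == ref:
            break  # converged: another round would change nothing
        ref = new_ref
    return ref, used


if __name__ == "__main__":
    seed = "GCTTGCAGCCTG"  # starting reference, two mismatches to the sample
    reads = ["GATTAC", "TACAGC", "AGCCTG", "TTACAG"]
    print(refine(seed, reads, max_rounds=1))
    print(refine(seed, reads))
```

In this example a single round gathers three of the four reads and corrects one position; allowing further rounds lets the corrected reference recruit the remaining read and recover the second position. A real pipeline would use gapped alignment, quality scores, and more careful consensus rules.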
id pubmed-5011931
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-5011931 2016-09-07 Viral deep sequencing needs an adaptive approach: IRMA, the iterative refinement meta-assembler. Shepard, Samuel S.; Meno, Sarah; Bahl, Justin; Wilson, Malania M.; Barnes, John; Neuhaus, Elizabeth. BMC Genomics. Methodology Article. BioMed Central 2016-09-05 /pmc/articles/PMC5011931/ /pubmed/27595578 http://dx.doi.org/10.1186/s12864-016-3030-6 Text en
© The Author(s). 2016. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Viral deep sequencing needs an adaptive approach: IRMA, the iterative refinement meta-assembler
topic Methodology Article