
A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses

BACKGROUND: Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected...


Bibliographic Details
Main Authors: Dimitrov, Kiril M., Sharma, Poonam, Volkening, Jeremy D., Goraichuk, Iryna V., Wajid, Abdul, Rehmani, Shafqat Fatima, Basharat, Asma, Shittu, Ismaila, Joannis, Tony M., Miller, Patti J., Afonso, Claudio L.
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5384157/
https://www.ncbi.nlm.nih.gov/pubmed/28388925
http://dx.doi.org/10.1186/s12985-017-0741-5
author Dimitrov, Kiril M.
Sharma, Poonam
Volkening, Jeremy D.
Goraichuk, Iryna V.
Wajid, Abdul
Rehmani, Shafqat Fatima
Basharat, Asma
Shittu, Ismaila
Joannis, Tony M.
Miller, Patti J.
Afonso, Claudio L.
collection PubMed
description BACKGROUND: Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected variants and co-infecting agents. However, NGS is not widely used for small RNA viruses because of incorrectly perceived cost estimates and inefficient utilization of freely available bioinformatics tools.
METHODS: In this study, we have utilized NGS-based random sequencing of total RNA combined with barcode multiplexing of libraries to quickly, effectively and simultaneously characterize the genomic sequences of multiple avian paramyxoviruses. Thirty libraries were prepared from diagnostic samples amplified in allantoic fluids and their total RNAs were sequenced in a single flow cell on an Illumina MiSeq instrument. After digital normalization, data were assembled using the MIRA assembler within a customized workflow on the Galaxy platform.
RESULTS: Twenty-eight avian paramyxovirus 1 (APMV-1), one APMV-13, four avian influenza and two infectious bronchitis virus complete or nearly complete genome sequences were obtained from the single run. The 29 avian paramyxovirus genomes displayed 99.6% mean coverage based on bases with Phred quality scores of 30 or more. The lower and upper quartiles of sample median depth per position for those 29 samples were 2984 and 6894, respectively, indicating coverage across samples sufficient for deep variant analysis. Sample processing and library preparation took approximately 25–30 h, the sequencing run took 39 h, and processing through the Galaxy workflow took approximately 2–3 h. The cost of all steps, excluding labor, was estimated to be 106 USD per sample.
CONCLUSIONS: This work describes an efficient multiplexing NGS approach, a detailed analysis workflow, and customized tools for the characterization of the genomes of RNA viruses. The combination of multiplexing NGS technology with the Galaxy workflow platform resulted in a fast, user-friendly, and cost-efficient protocol for the simultaneous characterization of multiple full-length viral genomes. Twenty-nine full-length or near-full-length APMV genomes with a high median depth were successfully sequenced out of 30 samples. The applied de novo assembly approach also allowed identification of mixed viral populations in some of the samples.
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12985-017-0741-5) contains supplementary material, which is available to authorized users.
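The abstract mentions digital normalization before assembly but does not say which tool or parameters were used. The sketch below illustrates one common formulation of the idea, assuming a median k-mer abundance cutoff: a read is kept only if the median count of its k-mers, among reads already kept, is still below the cutoff. The function names, the `k` and `cutoff` values, and the exact in-memory counter are illustrative assumptions, not the authors' workflow; production tools typically also use canonical k-mers and a probabilistic counting structure.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """Return all overlapping k-mers of seq (empty list if seq is shorter than k)."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def digital_normalization(reads, k=20, cutoff=20):
    """Keep a read only while its median k-mer count is below `cutoff`.

    Hypothetical helper: k and cutoff are illustrative defaults, not values
    taken from the article.
    """
    counts = Counter()   # k-mer abundances over the reads kept so far
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue     # read shorter than k: nothing to evaluate
        # Counter returns 0 for unseen k-mers without storing them.
        if median(counts[km] for km in read_kmers) < cutoff:
            kept.append(read)
            counts.update(read_kmers)
    return kept


if __name__ == "__main__":
    # Toy input: one highly redundant read plus one unique read.
    reads = ["ACGTACGTAC"] * 10 + ["TTTTGGGGCC"]
    print(digital_normalization(reads, k=4, cutoff=3))
```

In this toy input the redundant read is retained only until its k-mers reach the cutoff, while the unique read is always kept. This is how normalization reduces very deep, uneven coverage before de novo assembly without discarding low-abundance sequences.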
format Online
Article
Text
id pubmed-5384157
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5384157 2017-04-12 A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses Dimitrov, Kiril M. Sharma, Poonam Volkening, Jeremy D. Goraichuk, Iryna V. Wajid, Abdul Rehmani, Shafqat Fatima Basharat, Asma Shittu, Ismaila Joannis, Tony M. Miller, Patti J. Afonso, Claudio L. Virol J Methodology BioMed Central 2017-04-07 /pmc/articles/PMC5384157/ /pubmed/28388925 http://dx.doi.org/10.1186/s12985-017-0741-5 Text en © The Author(s). 2017 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses
topic Methodology