Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness
BACKGROUND: Estimating the number of different species (richness) in a mixed microbial population has been a main focus in metagenomic research. Existing methods of species richness estimation ride on the assumption that the reads in each assembled contig correspond to only one of the microbial geno...
Main authors: | Jayasundara, Duleepa; Saeed, I; Chang, BC; Tang, Sen-Lin; Halgamuge, Saman K |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4682401/ https://www.ncbi.nlm.nih.gov/pubmed/26678073 http://dx.doi.org/10.1186/1471-2105-16-S18-S3 |
_version_ | 1782405883025686528 |
---|---|
author | Jayasundara, Duleepa Saeed, I Chang, BC Tang, Sen-Lin Halgamuge, Saman K |
author_facet | Jayasundara, Duleepa Saeed, I Chang, BC Tang, Sen-Lin Halgamuge, Saman K |
author_sort | Jayasundara, Duleepa |
collection | PubMed |
description | BACKGROUND: Estimating the number of different species (richness) in a mixed microbial population has been a main focus in metagenomic research. Existing methods of species richness estimation ride on the assumption that the reads in each assembled contig correspond to only one of the microbial genomes in the population. This assumption and the underlying probabilistic formulations of existing methods are not useful for quasispecies populations where the strains are highly genetically related. The lack of knowledge on the number of different strains in a quasispecies population is observed to hinder the precision of existing Viral Quasispecies Spectrum Reconstruction (QSR) methods due to the uncontrolled reconstruction of a large number of in silico false positives. In this work, we formulated a novel probabilistic method for strain richness estimation specifically targeting viral quasispecies. By using this approach we improved our recently proposed spectrum reconstruction pipeline ViQuaS to achieve higher levels of precision in reconstructed quasispecies spectra without compromising the recall rates. We also discuss how one other existing popular QSR method named ShoRAH can be improved using this new approach. RESULTS: On benchmark data sets, our estimation method provided accurate richness estimates (< 0.2 median estimation error) and improved the precision of ViQuaS by 2%-13% and F-score by 1%-9% without compromising the recall rates. We also demonstrate that our estimation method can be used to improve the precision and F-score of ShoRAH by 0%-7% and 0%-5% respectively. CONCLUSIONS: The proposed probabilistic estimation method can be used to estimate the richness of viral populations with a quasispecies behavior and to improve the accuracy of the quasispecies spectra reconstructed by the existing methods ViQuaS and ShoRAH in the presence of a moderate level of technical sequencing errors. AVAILABILITY: http://sourceforge.net/projects/viquas/ |
format | Online Article Text |
id | pubmed-4682401 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-46824012015-12-21 Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness Jayasundara, Duleepa Saeed, I Chang, BC Tang, Sen-Lin Halgamuge, Saman K BMC Bioinformatics Research BACKGROUND: Estimating the number of different species (richness) in a mixed microbial population has been a main focus in metagenomic research. Existing methods of species richness estimation ride on the assumption that the reads in each assembled contig correspond to only one of the microbial genomes in the population. This assumption and the underlying probabilistic formulations of existing methods are not useful for quasispecies populations where the strains are highly genetically related. The lack of knowledge on the number of different strains in a quasispecies population is observed to hinder the precision of existing Viral Quasispecies Spectrum Reconstruction (QSR) methods due to the uncontrolled reconstruction of a large number of in silico false positives. In this work, we formulated a novel probabilistic method for strain richness estimation specifically targeting viral quasispecies. By using this approach we improved our recently proposed spectrum reconstruction pipeline ViQuaS to achieve higher levels of precision in reconstructed quasispecies spectra without compromising the recall rates. We also discuss how one other existing popular QSR method named ShoRAH can be improved using this new approach. RESULTS: On benchmark data sets, our estimation method provided accurate richness estimates (< 0.2 median estimation error) and improved the precision of ViQuaS by 2%-13% and F-score by 1%-9% without compromising the recall rates. We also demonstrate that our estimation method can be used to improve the precision and F-score of ShoRAH by 0%-7% and 0%-5% respectively. CONCLUSIONS: The proposed probabilistic estimation method can be used to estimate the richness of viral populations with a quasispecies behavior and to improve the accuracy of the quasispecies spectra reconstructed by the existing methods ViQuaS and ShoRAH in the presence of a moderate level of technical sequencing errors. AVAILABILITY: http://sourceforge.net/projects/viquas/ BioMed Central 2015-12-09 /pmc/articles/PMC4682401/ /pubmed/26678073 http://dx.doi.org/10.1186/1471-2105-16-S18-S3 Text en Copyright © 2015 Jayasundara et al.; http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Jayasundara, Duleepa Saeed, I Chang, BC Tang, Sen-Lin Halgamuge, Saman K Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness |
title | Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness |
title_full | Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness |
title_fullStr | Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness |
title_full_unstemmed | Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness |
title_short | Accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness |
title_sort | accurate reconstruction of viral quasispecies spectra through improved estimation of strain richness |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4682401/ https://www.ncbi.nlm.nih.gov/pubmed/26678073 http://dx.doi.org/10.1186/1471-2105-16-S18-S3 |
work_keys_str_mv | AT jayasundaraduleepa accuratereconstructionofviralquasispeciesspectrathroughimprovedestimationofstrainrichness AT saeedi accuratereconstructionofviralquasispeciesspectrathroughimprovedestimationofstrainrichness AT changbc accuratereconstructionofviralquasispeciesspectrathroughimprovedestimationofstrainrichness AT tangsenlin accuratereconstructionofviralquasispeciesspectrathroughimprovedestimationofstrainrichness AT halgamugesamank accuratereconstructionofviralquasispeciesspectrathroughimprovedestimationofstrainrichness |