
ENVirT: inference of ecological characteristics of viruses from metagenomic data

BACKGROUND: Estimating the parameters that describe the ecology of viruses, particularly those that are novel, can be made possible using metagenomic approaches. However, the best-performing existing methods require databases to first estimate an average genome length of a viral community before bein...


Bibliographic Details
Main authors: Jayasundara, Duleepa, Herath, Damayanthi, Senanayake, Damith, Saeed, Isaam, Yang, Cheng-Yu, Sun, Yuan, Chang, Bill C., Tang, Sen-Lin, Halgamuge, Saman K.
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7394321/
https://www.ncbi.nlm.nih.gov/pubmed/30717665
http://dx.doi.org/10.1186/s12859-018-2398-5
_version_ 1783565208630853632
author Jayasundara, Duleepa
Herath, Damayanthi
Senanayake, Damith
Saeed, Isaam
Yang, Cheng-Yu
Sun, Yuan
Chang, Bill C.
Tang, Sen-Lin
Halgamuge, Saman K.
collection PubMed
description BACKGROUND: Estimating the parameters that describe the ecology of viruses, particularly those that are novel, can be made possible using metagenomic approaches. However, the best-performing existing methods require databases to first estimate an average genome length of a viral community before being able to estimate other parameters, such as viral richness. Although this approach has been widely used, it can adversely skew results since the majority of viruses are yet to be catalogued in databases. RESULTS: In this paper, we present ENVirT, a method for estimating the richness of novel viral mixtures, and for the first time we also show that it is possible to simultaneously estimate the average genome length without a priori information. This is shown to be a significant improvement over database-dependent methods, since we can now robustly analyze samples that may include novel viral types under-represented in current databases. We demonstrate that the viral richness estimates produced by ENVirT are several orders of magnitude higher in accuracy than the estimates produced by the existing methods PHACCS and CatchAll when benchmarked against simulated data. We repeated the analysis of 20 metavirome samples using ENVirT, which produced results in close agreement with complementary in vitro analyses. CONCLUSIONS: These insights were previously not captured by existing computational methods. As such, ENVirT is shown to be an essential tool for enhancing our understanding of novel viral populations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2398-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-7394321
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7394321 2020-08-05 ENVirT: inference of ecological characteristics of viruses from metagenomic data Jayasundara, Duleepa Herath, Damayanthi Senanayake, Damith Saeed, Isaam Yang, Cheng-Yu Sun, Yuan Chang, Bill C. Tang, Sen-Lin Halgamuge, Saman K. BMC Bioinformatics Research BACKGROUND: Estimating the parameters that describe the ecology of viruses, particularly those that are novel, can be made possible using metagenomic approaches. However, the best-performing existing methods require databases to first estimate an average genome length of a viral community before being able to estimate other parameters, such as viral richness. Although this approach has been widely used, it can adversely skew results since the majority of viruses are yet to be catalogued in databases. RESULTS: In this paper, we present ENVirT, a method for estimating the richness of novel viral mixtures, and for the first time we also show that it is possible to simultaneously estimate the average genome length without a priori information. This is shown to be a significant improvement over database-dependent methods, since we can now robustly analyze samples that may include novel viral types under-represented in current databases. We demonstrate that the viral richness estimates produced by ENVirT are several orders of magnitude higher in accuracy than the estimates produced by the existing methods PHACCS and CatchAll when benchmarked against simulated data. We repeated the analysis of 20 metavirome samples using ENVirT, which produced results in close agreement with complementary in vitro analyses. CONCLUSIONS: These insights were previously not captured by existing computational methods. As such, ENVirT is shown to be an essential tool for enhancing our understanding of novel viral populations. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s12859-018-2398-5) contains supplementary material, which is available to authorized users.
BioMed Central 2019-02-04 /pmc/articles/PMC7394321/ /pubmed/30717665 http://dx.doi.org/10.1186/s12859-018-2398-5 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title ENVirT: inference of ecological characteristics of viruses from metagenomic data
topic Research