VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences
BACKGROUND: Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools...
Main Authors: | Kieft, Kristopher; Zhou, Zhichao; Anantharaman, Karthik |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | Methodology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7288430/ https://www.ncbi.nlm.nih.gov/pubmed/32522236 http://dx.doi.org/10.1186/s40168-020-00867-0 |
_version_ | 1783545274762788864 |
---|---|
author | Kieft, Kristopher Zhou, Zhichao Anantharaman, Karthik |
author_facet | Kieft, Kristopher Zhou, Zhichao Anantharaman, Karthik |
author_sort | Kieft, Kristopher |
collection | PubMed |
description | BACKGROUND: Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools available for analyses of viral genomes and assessing their metabolic impacts on microbiomes. DESIGN: Here we present VIBRANT, the first method to utilize a hybrid machine learning and protein similarity approach that is not reliant on sequence features for automated recovery and annotation of viruses, determination of genome quality and completeness, and characterization of viral community function from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and a newly developed v-score metric that circumvents traditional boundaries to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. VIBRANT highlights viral auxiliary metabolic genes and metabolic pathways, thereby serving as a user-friendly platform for evaluating viral community function. VIBRANT was trained and validated on reference virus datasets as well as microbiome and virome data. RESULTS: VIBRANT showed superior performance in recovering higher quality viruses and concurrently reduced the false identification of non-viral genome fragments in comparison to other virus identification programs, specifically VirSorter, VirFinder, and MARVEL. When applied to 120,834 metagenome-derived viral sequences representing several human and natural environments, VIBRANT recovered an average of 94% of the viruses, whereas VirFinder, VirSorter, and MARVEL achieved less powerful performance, averaging 48%, 87%, and 71%, respectively. Similarly, VIBRANT identified more total viral sequence and proteins when applied to real metagenomes. 
When compared to PHASTER, Prophage Hunter, and VirSorter for the ability to extract integrated provirus regions from host scaffolds, VIBRANT performed comparably and even identified proviruses that the other programs did not. To demonstrate applications of VIBRANT, we studied viromes associated with Crohn’s disease to show that specific viral groups, namely Enterobacteriales-like viruses, as well as putative dysbiosis associated viral proteins are more abundant compared to healthy individuals, providing a possible viral link to maintenance of diseased states. CONCLUSIONS: The ability to accurately recover viruses and explore viral impacts on microbial community metabolism will greatly advance our understanding of microbiomes, host-microbe interactions, and ecosystem dynamics. |
format | Online Article Text |
id | pubmed-7288430 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-72884302020-06-11 VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences Kieft, Kristopher Zhou, Zhichao Anantharaman, Karthik Microbiome Methodology BACKGROUND: Viruses are central to microbial community structure in all environments. The ability to generate large metagenomic assemblies of mixed microbial and viral sequences provides the opportunity to tease apart complex microbiome dynamics, but these analyses are currently limited by the tools available for analyses of viral genomes and assessing their metabolic impacts on microbiomes. DESIGN: Here we present VIBRANT, the first method to utilize a hybrid machine learning and protein similarity approach that is not reliant on sequence features for automated recovery and annotation of viruses, determination of genome quality and completeness, and characterization of viral community function from metagenomic assemblies. VIBRANT uses neural networks of protein signatures and a newly developed v-score metric that circumvents traditional boundaries to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. VIBRANT highlights viral auxiliary metabolic genes and metabolic pathways, thereby serving as a user-friendly platform for evaluating viral community function. VIBRANT was trained and validated on reference virus datasets as well as microbiome and virome data. RESULTS: VIBRANT showed superior performance in recovering higher quality viruses and concurrently reduced the false identification of non-viral genome fragments in comparison to other virus identification programs, specifically VirSorter, VirFinder, and MARVEL. 
When applied to 120,834 metagenome-derived viral sequences representing several human and natural environments, VIBRANT recovered an average of 94% of the viruses, whereas VirFinder, VirSorter, and MARVEL achieved less powerful performance, averaging 48%, 87%, and 71%, respectively. Similarly, VIBRANT identified more total viral sequence and proteins when applied to real metagenomes. When compared to PHASTER, Prophage Hunter, and VirSorter for the ability to extract integrated provirus regions from host scaffolds, VIBRANT performed comparably and even identified proviruses that the other programs did not. To demonstrate applications of VIBRANT, we studied viromes associated with Crohn’s disease to show that specific viral groups, namely Enterobacteriales-like viruses, as well as putative dysbiosis associated viral proteins are more abundant compared to healthy individuals, providing a possible viral link to maintenance of diseased states. CONCLUSIONS: The ability to accurately recover viruses and explore viral impacts on microbial community metabolism will greatly advance our understanding of microbiomes, host-microbe interactions, and ecosystem dynamics. BioMed Central 2020-06-10 /pmc/articles/PMC7288430/ /pubmed/32522236 http://dx.doi.org/10.1186/s40168-020-00867-0 Text en © The Author(s) 2020 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. 
If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Methodology Kieft, Kristopher Zhou, Zhichao Anantharaman, Karthik VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences |
title | VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences |
title_full | VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences |
title_fullStr | VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences |
title_full_unstemmed | VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences |
title_short | VIBRANT: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences |
title_sort | vibrant: automated recovery, annotation and curation of microbial viruses, and evaluation of viral community function from genomic sequences |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7288430/ https://www.ncbi.nlm.nih.gov/pubmed/32522236 http://dx.doi.org/10.1186/s40168-020-00867-0 |
work_keys_str_mv | AT kieftkristopher vibrantautomatedrecoveryannotationandcurationofmicrobialvirusesandevaluationofviralcommunityfunctionfromgenomicsequences AT zhouzhichao vibrantautomatedrecoveryannotationandcurationofmicrobialvirusesandevaluationofviralcommunityfunctionfromgenomicsequences AT anantharamankarthik vibrantautomatedrecoveryannotationandcurationofmicrobialvirusesandevaluationofviralcommunityfunctionfromgenomicsequences |