VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models

The study of viral communities has revealed the enormous diversity and impact these biological entities have on various ecosystems. These observations have sparked widespread interest in developing computational strategies that support the comprehensive characterisation of viral communities based on...

Bibliographic Details
Main Authors: Rangel-Pineros, Guillermo; Almeida, Alexandre; Beracochea, Martin; Sakharova, Ekaterina; Marz, Manja; Reyes Muñoz, Alejandro; Hölzer, Martin; Finn, Robert D.
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10491390/
https://www.ncbi.nlm.nih.gov/pubmed/37639475
http://dx.doi.org/10.1371/journal.pcbi.1011422
_version_ 1785104049904812032
author Rangel-Pineros, Guillermo
Almeida, Alexandre
Beracochea, Martin
Sakharova, Ekaterina
Marz, Manja
Reyes Muñoz, Alejandro
Hölzer, Martin
Finn, Robert D.
author_sort Rangel-Pineros, Guillermo
collection PubMed
description The study of viral communities has revealed the enormous diversity and impact these biological entities have on various ecosystems. These observations have sparked widespread interest in developing computational strategies that support the comprehensive characterisation of viral communities based on sequencing data. Here we introduce VIRify, a new computational pipeline designed to provide a user-friendly and accurate functional and taxonomic characterisation of viral communities. VIRify identifies viral contigs and prophages from metagenomic assemblies and annotates them using a collection of viral profile hidden Markov models (HMMs). These include our manually-curated profile HMMs, which serve as specific taxonomic markers for a wide range of prokaryotic and eukaryotic viral taxa and are thus used to reliably classify viral contigs. We tested VIRify on assemblies from two microbial mock communities, a large metagenomics study, and a collection of publicly available viral genomic sequences from the human gut. The results showed that VIRify could identify sequences from both prokaryotic and eukaryotic viruses, and provided taxonomic classifications from the genus to the family rank with an average accuracy of 86.6%. In addition, VIRify allowed the detection and taxonomic classification of a range of prokaryotic and eukaryotic viruses present in 243 marine metagenomic assemblies. Finally, the use of VIRify led to a large expansion in the number of taxonomically classified human gut viral sequences and the improvement of outdated and shallow taxonomic classifications. Overall, we demonstrate that VIRify is a novel and powerful resource that offers an enhanced capability to detect a broad range of viral contigs and taxonomically classify them.
format Online
Article
Text
id pubmed-10491390
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10491390 2023-09-09 PLoS Comput Biol Research Article Public Library of Science 2023-08-28 /pmc/articles/PMC10491390/ /pubmed/37639475 http://dx.doi.org/10.1371/journal.pcbi.1011422 Text en © 2023 Rangel-Pineros et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title VIRify: An integrated detection, annotation and taxonomic classification pipeline using virus-specific protein profile hidden Markov models
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10491390/
https://www.ncbi.nlm.nih.gov/pubmed/37639475
http://dx.doi.org/10.1371/journal.pcbi.1011422