
Mining, analyzing, and integrating viral signals from metagenomic data

BACKGROUND: Viruses are important components of microbial communities modulating community structure and function; however, only a couple of tools are currently available for phage identification and analysis from metagenomic sequencing data. Here we employed the random forest algorithm to develop V...


Bibliographic Details
Main authors: Zheng, Tingting, Li, Jun, Ni, Yueqiong, Kang, Kang, Misiakou, Maria-Anna, Imamovic, Lejla, Chow, Billy K. C., Rode, Anne A., Bytzer, Peter, Sommer, Morten, Panagiotou, Gianni
Format: Online Article Text
Language: English
Published: BioMed Central 2019
Subjects: Methodology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6425642/
https://www.ncbi.nlm.nih.gov/pubmed/30890181
http://dx.doi.org/10.1186/s40168-019-0657-y
author Zheng, Tingting
Li, Jun
Ni, Yueqiong
Kang, Kang
Misiakou, Maria-Anna
Imamovic, Lejla
Chow, Billy K. C.
Rode, Anne A.
Bytzer, Peter
Sommer, Morten
Panagiotou, Gianni
author_sort Zheng, Tingting
collection PubMed
description BACKGROUND: Viruses are important components of microbial communities that modulate community structure and function; however, only a couple of tools are currently available for phage identification and analysis from metagenomic sequencing data. Here we employed the random forest algorithm to develop VirMiner, a web-based phage contig prediction tool especially sensitive to high-abundance phage contigs, trained and validated on paired metagenomic and phagenomic sequencing data from the human gut flora. RESULTS: VirMiner achieved 41.06% ± 17.51% sensitivity and 81.91% ± 4.04% specificity in the prediction of phage contigs. In particular, for the high-abundance phage contigs, VirMiner outperformed other tools (VirFinder and VirSorter) with much higher sensitivity (65.23% ± 16.94%) than VirFinder (34.63% ± 17.96%) and VirSorter (18.75% ± 15.23%) at almost the same specificity. Moreover, VirMiner provides the most comprehensive phage analysis pipeline, comprising metagenomic raw read processing, functional annotation, phage contig identification, and phage-host relationship prediction (CRISPR-spacer recognition), and supports two-group comparison when the input (metagenomic sequence data) includes different conditions (e.g., case and control). Application of VirMiner to an independent cohort of human gut metagenomes obtained from individuals treated with antibiotics revealed that 122 KEGG orthology and 118 Pfam groups had significantly differential abundance in the pre-treatment samples compared to samples at the end of antibiotic administration, including clustered regularly interspaced short palindromic repeats (CRISPR), multidrug resistance, and protein transport. The VirMiner webserver is available at http://sbb.hku.hk/VirMiner/. CONCLUSIONS: We developed a comprehensive tool for phage prediction and analysis for metagenomic samples.
Compared to VirSorter and VirFinder, the most widely used tools, VirMiner is able to capture more high-abundance phage contigs, which could play key roles in infecting bacteria and modulating microbial community dynamics. TRIAL REGISTRATION: The European Union Clinical Trials Register, EudraCT Number: 2013-003378-28. Registered on 9 April 2014. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (10.1186/s40168-019-0657-y) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-6425642
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-6425642 2019-04-01 Mining, analyzing, and integrating viral signals from metagenomic data. Microbiome. Methodology. BioMed Central 2019-03-19 /pmc/articles/PMC6425642/ /pubmed/30890181 http://dx.doi.org/10.1186/s40168-019-0657-y Text en © The Author(s). 2019 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Mining, analyzing, and integrating viral signals from metagenomic data
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6425642/
https://www.ncbi.nlm.nih.gov/pubmed/30890181
http://dx.doi.org/10.1186/s40168-019-0657-y