TaxiBGC: a Taxonomy-Guided Approach for Profiling Experimentally Characterized Microbial Biosynthetic Gene Clusters and Secondary Metabolite Production Potential in Metagenomes

Bibliographic Details
Main authors: Gupta, Vinod K.; Bakshi, Utpal; Chang, Daniel; Lee, Aileen R.; Davis, John M.; Chandrasekaran, Sriram; Jin, Yong-Su; Freeman, Michael F.; Sung, Jaeyun
Format: Online Article Text
Language: English
Published: American Society for Microbiology 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9765181/
https://www.ncbi.nlm.nih.gov/pubmed/36378489
http://dx.doi.org/10.1128/msystems.00925-22
author Gupta, Vinod K.
Bakshi, Utpal
Chang, Daniel
Lee, Aileen R.
Davis, John M.
Chandrasekaran, Sriram
Jin, Yong-Su
Freeman, Michael F.
Sung, Jaeyun
author_sort Gupta, Vinod K.
collection PubMed
description Biosynthetic gene clusters (BGCs) in microbial genomes encode bioactive secondary metabolites (SMs), which can play important roles in microbe-microbe and host-microbe interactions. Given the biological significance of SMs and the current profound interest in the metabolic functions of microbiomes, the unbiased identification of BGCs from high-throughput metagenomic data could offer novel insights into the complex chemical ecology of microbial communities. Currently available tools for predicting BGCs from shotgun metagenomes have several limitations, including the need for computationally demanding read assembly, predicting a narrow breadth of BGC classes, and not providing the SM product. To overcome these limitations, we developed taxonomy-guided identification of biosynthetic gene clusters (TaxiBGC), a command-line tool for predicting experimentally characterized BGCs (and inferring their known SMs) in metagenomes by first pinpointing the microbial species likely to harbor them. We benchmarked TaxiBGC on various simulated metagenomes, showing that our taxonomy-guided approach could predict BGCs with much-improved performance (mean F1 score, 0.56; mean PPV score, 0.80) compared with directly identifying BGCs by mapping sequencing reads onto the BGC genes (mean F1 score, 0.49; mean PPV score, 0.41). Next, by applying TaxiBGC to 2,650 metagenomes from the Human Microbiome Project and various case-control gut microbiome studies, we were able to associate BGCs (and their SMs) with different human body sites and with multiple diseases, including Crohn’s disease and liver cirrhosis. In all, TaxiBGC provides an in silico platform to predict experimentally characterized BGCs and their SM production potential in metagenomic data while demonstrating important advantages over existing techniques.
IMPORTANCE Currently available bioinformatics tools to identify BGCs from metagenomic sequencing data are limited in their predictive capability or in their ease of use, even for computationally oriented researchers. We present an automated computational pipeline called TaxiBGC, which predicts experimentally characterized BGCs (and infers their known SMs) in shotgun metagenomes by first considering the microbial species source. Through rigorous benchmarking on simulated metagenomes, we show that TaxiBGC provides a significant advantage over existing methods. When demonstrating TaxiBGC on thousands of human microbiome samples, we associate BGCs encoding bacteriocins with different human body sites and diseases, thereby elucidating a possible novel role of this antibiotic class in maintaining the stability of microbial ecosystems throughout the human body. Furthermore, we report for the first time gut microbial BGC associations shared among multiple pathologies. Ultimately, we expect our tool to facilitate future investigations into the chemical ecology of microbial communities across diverse niches and pathologies.
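The benchmark figures quoted in the description are F1 and PPV (positive predictive value, i.e. precision) scores. As a minimal sketch of how such scores are usually computed for one sample, the following compares a set of predicted BGCs with the set known to be present. The function name and the BGC identifiers are hypothetical placeholders, not taken from TaxiBGC, and the record does not describe the authors' exact evaluation procedure.

```python
def ppv_recall_f1(predicted, truth):
    """Return (PPV, recall, F1) for one sample, comparing two collections of BGC IDs."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)

    # PPV: fraction of predicted BGCs that are really present.
    ppv = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of truly present BGCs that were predicted.
    recall = true_positives / len(truth) if truth else 0.0
    # F1: harmonic mean of PPV and recall (0 when both are 0).
    f1 = 2 * ppv * recall / (ppv + recall) if (ppv + recall) else 0.0
    return ppv, recall, f1


# Hypothetical identifiers, for illustration only.
truth = {"bgc_a", "bgc_b", "bgc_c", "bgc_d", "bgc_e"}
predicted = {"bgc_a", "bgc_b", "bgc_c", "bgc_x"}

ppv, recall, f1 = ppv_recall_f1(predicted, truth)
print(f"PPV={ppv:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Because the reported values are means over many simulated metagenomes, the mean F1 cannot be derived from the mean PPV alone; each sample's F1 depends on its own recall as well.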
format Online
Article
Text
id pubmed-9765181
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Society for Microbiology
record_format MEDLINE/PubMed
spelling pubmed-9765181 2022-12-21 mSystems Methods and Protocols American Society for Microbiology 2022-11-15 /pmc/articles/PMC9765181/ /pubmed/36378489 http://dx.doi.org/10.1128/msystems.00925-22 Text en Copyright © 2022 Gupta et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International license (https://creativecommons.org/licenses/by/4.0/).
title TaxiBGC: a Taxonomy-Guided Approach for Profiling Experimentally Characterized Microbial Biosynthetic Gene Clusters and Secondary Metabolite Production Potential in Metagenomes
topic Methods and Protocols