Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders

BACKGROUND: Efficient industrial processes for converting plant lignocellulosic materials into biofuels are a key to global efforts to come up with alternative energy sources to fossil fuels. Novel cellulolytic enzymes have been discovered in microbial genomes and metagenomes of microbial communitie...

Bibliographic Details
Main Authors: Konietzny, Sebastian GA; Pope, Phillip B; Weimann, Aaron; McHardy, Alice C
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4189754/
https://www.ncbi.nlm.nih.gov/pubmed/25342967
http://dx.doi.org/10.1186/s13068-014-0124-8
_version_ 1782338415170158592
author Konietzny, Sebastian GA
Pope, Phillip B
Weimann, Aaron
McHardy, Alice C
author_facet Konietzny, Sebastian GA
Pope, Phillip B
Weimann, Aaron
McHardy, Alice C
author_sort Konietzny, Sebastian GA
collection PubMed
description BACKGROUND: Efficient industrial processes for converting plant lignocellulosic materials into biofuels are a key to global efforts to come up with alternative energy sources to fossil fuels. Novel cellulolytic enzymes have been discovered in microbial genomes and metagenomes of microbial communities. However, the identification of relevant genes without known homologs, and the elucidation of the lignocellulolytic pathways and protein complexes for different microorganisms remain challenging. RESULTS: We describe a new computational method for the targeted discovery of functional modules of plant biomass-degrading protein families, based on their co-occurrence patterns across genomes and metagenome datasets, and the strength of association of these modules with the genomes of known degraders. From approximately 6.4 million family annotations for 2,884 microbial genomes, and 332 taxonomic bins from 18 metagenomes, we identified 5 functional modules that are distinctive for plant biomass degraders, which we term “plant biomass degradation modules” (PDMs). These modules incorporate protein families involved in the degradation of cellulose, hemicelluloses, and pectins, structural components of the cellulosome, and additional families with potential functions in plant biomass degradation. The PDMs were linked to 81 gene clusters in genomes of known lignocellulose degraders, including previously described clusters of lignocellulolytic genes. On average, 70% of the families of each PDM were found to map to gene clusters in known degraders, which served as an additional confirmation of their functional relationships. The presence of a PDM in a genome or taxonomic metagenome bin furthermore allowed us to accurately predict the ability of any particular organism to degrade plant biomass. 
For 15 draft genomes of a cow rumen metagenome, we used cross-referencing to confirmed cellulolytic enzymes to validate that the PDMs identified plant biomass degraders within a complex microbial community. CONCLUSIONS: Functional modules of protein families that are involved in different aspects of plant cell wall degradation can be inferred from co-occurrence patterns across (meta-)genomes with a probabilistic topic model. PDMs represent a new resource of protein families and candidate genes implicated in microbial plant biomass degradation. They can also be used to predict the plant biomass degradation ability for a genome or taxonomic bin. The method is also suitable for characterizing other microbial phenotypes. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13068-014-0124-8) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4189754
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4189754 2014-10-23 Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders Konietzny, Sebastian GA Pope, Phillip B Weimann, Aaron McHardy, Alice C Biotechnol Biofuels Research Article BioMed Central 2014-09-09 /pmc/articles/PMC4189754/ /pubmed/25342967 http://dx.doi.org/10.1186/s13068-014-0124-8 Text en © Konietzny et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Konietzny, Sebastian GA
Pope, Phillip B
Weimann, Aaron
McHardy, Alice C
Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
title Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
title_full Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
title_fullStr Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
title_full_unstemmed Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
title_short Inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
title_sort inference of phenotype-defining functional modules of protein families for microbial plant biomass degraders
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4189754/
https://www.ncbi.nlm.nih.gov/pubmed/25342967
http://dx.doi.org/10.1186/s13068-014-0124-8
work_keys_str_mv AT konietznysebastianga inferenceofphenotypedefiningfunctionalmodulesofproteinfamiliesformicrobialplantbiomassdegraders
AT popephillipb inferenceofphenotypedefiningfunctionalmodulesofproteinfamiliesformicrobialplantbiomassdegraders
AT weimannaaron inferenceofphenotypedefiningfunctionalmodulesofproteinfamiliesformicrobialplantbiomassdegraders
AT mchardyalicec inferenceofphenotypedefiningfunctionalmodulesofproteinfamiliesformicrobialplantbiomassdegraders