
BioMiCo: a supervised Bayesian model for inference of microbial community structure

BACKGROUND: Microbiome samples often represent mixtures of communities, where each community is composed of overlapping assemblages of species. Such mixtures are complex, the number of species is huge and abundance information for many species is often sparse. Classical methods have a limited value...


Bibliographic Details
Main Authors: Shafiei, Mahdi, Dunn, Katherine A, Boon, Eva, MacDonald, Shelley M, Walsh, David A, Gu, Hong, Bielawski, Joseph P
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4359585/
https://www.ncbi.nlm.nih.gov/pubmed/25774293
http://dx.doi.org/10.1186/s40168-015-0073-x
_version_ 1782361437777166336
author Shafiei, Mahdi
Dunn, Katherine A
Boon, Eva
MacDonald, Shelley M
Walsh, David A
Gu, Hong
Bielawski, Joseph P
author_facet Shafiei, Mahdi
Dunn, Katherine A
Boon, Eva
MacDonald, Shelley M
Walsh, David A
Gu, Hong
Bielawski, Joseph P
author_sort Shafiei, Mahdi
collection PubMed
description BACKGROUND: Microbiome samples often represent mixtures of communities, where each community is composed of overlapping assemblages of species. Such mixtures are complex, the number of species is huge and abundance information for many species is often sparse. Classical methods have a limited value for identifying complex features within such data. RESULTS: Here, we describe a novel hierarchical model for Bayesian inference of microbial communities (BioMiCo). The model takes abundance data derived from environmental DNA, and models the composition of each sample by a two-level hierarchy of mixture distributions constrained by Dirichlet priors. BioMiCo is supervised, using known features for samples and appropriate prior constraints to overcome the challenges posed by many variables, sparse data, and large numbers of rare species. The model is trained on a portion of the data, where it learns how assemblages of species are mixed to form communities and how assemblages are related to the known features of each sample. Training yields a model that can predict the features of new samples. We used BioMiCo to build models for three serially sampled datasets and tested their predictive accuracy across different time points. The first model was trained to predict both body site (hand, mouth, and gut) and individual human host. It was able to reliably distinguish these features across different time points. The second was trained on vaginal microbiomes to predict both the Nugent score and individual human host. We found that women having normal and elevated Nugent scores had distinct microbiome structures that persisted over time, with additional structure within women having elevated scores. The third was trained for the purpose of assessing seasonal transitions in a coastal bacterial community. 
Application of this model to a high-resolution time series permitted us to track the rate and time of community succession and accurately predict known ecosystem-level events. CONCLUSION: BioMiCo provides a framework for learning the structure of microbial communities and for making predictions based on microbial assemblages. By training on carefully chosen features (abiotic or biotic), BioMiCo can be used to understand and predict transitions between complex communities composed of hundreds of microbial species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-015-0073-x) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4359585
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-43595852015-03-15 BioMiCo: a supervised Bayesian model for inference of microbial community structure Shafiei, Mahdi Dunn, Katherine A Boon, Eva MacDonald, Shelley M Walsh, David A Gu, Hong Bielawski, Joseph P Microbiome Methodology BACKGROUND: Microbiome samples often represent mixtures of communities, where each community is composed of overlapping assemblages of species. Such mixtures are complex, the number of species is huge and abundance information for many species is often sparse. Classical methods have a limited value for identifying complex features within such data. RESULTS: Here, we describe a novel hierarchical model for Bayesian inference of microbial communities (BioMiCo). The model takes abundance data derived from environmental DNA, and models the composition of each sample by a two-level hierarchy of mixture distributions constrained by Dirichlet priors. BioMiCo is supervised, using known features for samples and appropriate prior constraints to overcome the challenges posed by many variables, sparse data, and large numbers of rare species. The model is trained on a portion of the data, where it learns how assemblages of species are mixed to form communities and how assemblages are related to the known features of each sample. Training yields a model that can predict the features of new samples. We used BioMiCo to build models for three serially sampled datasets and tested their predictive accuracy across different time points. The first model was trained to predict both body site (hand, mouth, and gut) and individual human host. It was able to reliably distinguish these features across different time points. The second was trained on vaginal microbiomes to predict both the Nugent score and individual human host. We found that women having normal and elevated Nugent scores had distinct microbiome structures that persisted over time, with additional structure within women having elevated scores. 
The third was trained for the purpose of assessing seasonal transitions in a coastal bacterial community. Application of this model to a high-resolution time series permitted us to track the rate and time of community succession and accurately predict known ecosystem-level events. CONCLUSION: BioMiCo provides a framework for learning the structure of microbial communities and for making predictions based on microbial assemblages. By training on carefully chosen features (abiotic or biotic), BioMiCo can be used to understand and predict transitions between complex communities composed of hundreds of microbial species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-015-0073-x) contains supplementary material, which is available to authorized users. BioMed Central 2015-03-10 /pmc/articles/PMC4359585/ /pubmed/25774293 http://dx.doi.org/10.1186/s40168-015-0073-x Text en © Shafiei et al.; licensee BioMed Central. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology
Shafiei, Mahdi
Dunn, Katherine A
Boon, Eva
MacDonald, Shelley M
Walsh, David A
Gu, Hong
Bielawski, Joseph P
BioMiCo: a supervised Bayesian model for inference of microbial community structure
title BioMiCo: a supervised Bayesian model for inference of microbial community structure
title_full BioMiCo: a supervised Bayesian model for inference of microbial community structure
title_fullStr BioMiCo: a supervised Bayesian model for inference of microbial community structure
title_full_unstemmed BioMiCo: a supervised Bayesian model for inference of microbial community structure
title_short BioMiCo: a supervised Bayesian model for inference of microbial community structure
title_sort biomico: a supervised bayesian model for inference of microbial community structure
topic Methodology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4359585/
https://www.ncbi.nlm.nih.gov/pubmed/25774293
http://dx.doi.org/10.1186/s40168-015-0073-x