
Phylogenetic approaches to microbial community classification

BACKGROUND: The microbiota from different body sites are dominated by different major groups of microbes, but the variations within a body site such as the mouth can be more subtle. Accurate predictive models can serve as useful tools for distinguishing sub-sites and understanding key organisms and...


Bibliographic Details
Main authors: Ning, Jie; Beiko, Robert G.
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4593236/
https://www.ncbi.nlm.nih.gov/pubmed/26437943
http://dx.doi.org/10.1186/s40168-015-0114-5
author Ning, Jie
Beiko, Robert G.
collection PubMed
description BACKGROUND: The microbiota from different body sites are dominated by different major groups of microbes, but the variations within a body site such as the mouth can be more subtle. Accurate predictive models can serve as useful tools for distinguishing sub-sites and understanding key organisms and their roles and can highlight deviations from expected distributions of microbes. Good classification depends on choosing the right combination of classifier, feature representation, and learning model. Machine-learning procedures have been used in the past for supervised classification, but increased attention to feature representation and selection may produce better models and predictions. RESULTS: We focused our attention on the classification of nine oral sites and dental plaque in particular, using data collected from the Human Microbiome Project. A key focus of our representations was the use of phylogenetic information, both as the basis for custom kernels and as a way to represent sets of microbes to the classifier. We also used the PICRUSt software, which draws on phylogenetic relationships to predict molecular functions and to generate additional features for the classifier. Custom kernels based on the UniFrac measure of community dissimilarity did not improve performance. However, feature representation was vital to classification accuracy, with microbial clade and function representations providing useful information to the classifier; combining the two types of features did not yield increased prediction accuracy. Many of the best-performing clades and functions had clear associations with oral microflora. CONCLUSIONS: The classification of oral microbiota remains a challenging problem; our best accuracy on the plaque dataset was approximately 81%. Perfect accuracy may be unattainable due to the close proximity of the sites and intra-individual variation. However, further exploration of the space of both classifiers and feature representations is likely to increase the accuracy of predictive models. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-015-0114-5) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4593236
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4593236 2015-10-06 Phylogenetic approaches to microbial community classification Ning, Jie Beiko, Robert G. Microbiome Research BioMed Central 2015-10-05 /pmc/articles/PMC4593236/ /pubmed/26437943 http://dx.doi.org/10.1186/s40168-015-0114-5 Text en © Ning and Beiko. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Phylogenetic approaches to microbial community classification
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4593236/
https://www.ncbi.nlm.nih.gov/pubmed/26437943
http://dx.doi.org/10.1186/s40168-015-0114-5