Phylogenetic approaches to microbial community classification
BACKGROUND: The microbiota from different body sites are dominated by different major groups of microbes, but the variations within a body site such as the mouth can be more subtle. Accurate predictive models can serve as useful tools for distinguishing sub-sites and understanding key organisms and...
Main Authors: | Ning, Jie; Beiko, Robert G. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2015 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4593236/ https://www.ncbi.nlm.nih.gov/pubmed/26437943 http://dx.doi.org/10.1186/s40168-015-0114-5 |
_version_ | 1782393301085716480 |
---|---|
author | Ning, Jie; Beiko, Robert G. |
collection | PubMed |
description | BACKGROUND: The microbiota from different body sites are dominated by different major groups of microbes, but the variations within a body site such as the mouth can be more subtle. Accurate predictive models can serve as useful tools for distinguishing sub-sites and understanding key organisms and their roles and can highlight deviations from expected distributions of microbes. Good classification depends on choosing the right combination of classifier, feature representation, and learning model. Machine-learning procedures have been used in the past for supervised classification, but increased attention to feature representation and selection may produce better models and predictions. RESULTS: We focused our attention on the classification of nine oral sites and dental plaque in particular, using data collected from the Human Microbiome Project. A key focus of our representations was the use of phylogenetic information, both as the basis for custom kernels and as a way to represent sets of microbes to the classifier. We also used the PICRUSt software, which draws on phylogenetic relationships to predict molecular functions and to generate additional features for the classifier. Custom kernels based on the UniFrac measure of community dissimilarity did not improve performance. However, feature representation was vital to classification accuracy, with microbial clade and function representations providing useful information to the classifier; combining the two types of features did not yield increased prediction accuracy. Many of the best-performing clades and functions had clear associations with oral microflora. CONCLUSIONS: The classification of oral microbiota remains a challenging problem; our best accuracy on the plaque dataset was approximately 81 %. Perfect accuracy may be unattainable due to the close proximity of the sites and intra-individual variation. However, further exploration of the space of both classifiers and feature representations is likely to increase the accuracy of predictive models. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-015-0114-5) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4593236 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-4593236 2015-10-06 Phylogenetic approaches to microbial community classification Ning, Jie Beiko, Robert G. Microbiome Research BACKGROUND: The microbiota from different body sites are dominated by different major groups of microbes, but the variations within a body site such as the mouth can be more subtle. Accurate predictive models can serve as useful tools for distinguishing sub-sites and understanding key organisms and their roles and can highlight deviations from expected distributions of microbes. Good classification depends on choosing the right combination of classifier, feature representation, and learning model. Machine-learning procedures have been used in the past for supervised classification, but increased attention to feature representation and selection may produce better models and predictions. RESULTS: We focused our attention on the classification of nine oral sites and dental plaque in particular, using data collected from the Human Microbiome Project. A key focus of our representations was the use of phylogenetic information, both as the basis for custom kernels and as a way to represent sets of microbes to the classifier. We also used the PICRUSt software, which draws on phylogenetic relationships to predict molecular functions and to generate additional features for the classifier. Custom kernels based on the UniFrac measure of community dissimilarity did not improve performance. However, feature representation was vital to classification accuracy, with microbial clade and function representations providing useful information to the classifier; combining the two types of features did not yield increased prediction accuracy. Many of the best-performing clades and functions had clear associations with oral microflora. CONCLUSIONS: The classification of oral microbiota remains a challenging problem; our best accuracy on the plaque dataset was approximately 81 %. Perfect accuracy may be unattainable due to the close proximity of the sites and intra-individual variation. However, further exploration of the space of both classifiers and feature representations is likely to increase the accuracy of predictive models. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s40168-015-0114-5) contains supplementary material, which is available to authorized users. BioMed Central 2015-10-05 /pmc/articles/PMC4593236/ /pubmed/26437943 http://dx.doi.org/10.1186/s40168-015-0114-5 Text en © Ning and Beiko. 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Research Ning, Jie Beiko, Robert G. Phylogenetic approaches to microbial community classification |
title | Phylogenetic approaches to microbial community classification |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4593236/ https://www.ncbi.nlm.nih.gov/pubmed/26437943 http://dx.doi.org/10.1186/s40168-015-0114-5 |