A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type
The microbiome is a new frontier for building predictors of human phenotypes. However, machine learning in the microbiome is fraught with issues of reproducibility, driven in large part by the wide range of analytic models and metagenomic data types available. We aimed to build robust metagenomic pr...
Main Authors: | Le Goallec, Alan; Tierney, Braden T.; Luber, Jacob M.; Cofer, Evan M.; Kostic, Aleksandar D.; Patel, Chirag J. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Public Library of Science 2020 |
Subjects: | Research Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7241849/ https://www.ncbi.nlm.nih.gov/pubmed/32392251 http://dx.doi.org/10.1371/journal.pcbi.1007895 |
_version_ | 1783537143811932160 |
---|---|
author | Le Goallec, Alan Tierney, Braden T. Luber, Jacob M. Cofer, Evan M. Kostic, Aleksandar D. Patel, Chirag J. |
author_facet | Le Goallec, Alan Tierney, Braden T. Luber, Jacob M. Cofer, Evan M. Kostic, Aleksandar D. Patel, Chirag J. |
author_sort | Le Goallec, Alan |
collection | PubMed |
description | The microbiome is a new frontier for building predictors of human phenotypes. However, machine learning in the microbiome is fraught with issues of reproducibility, driven in large part by the wide range of analytic models and metagenomic data types available. We aimed to build robust metagenomic predictors of host phenotype by comparing prediction performances and biological interpretation across 8 machine learning methods and 4 different types of metagenomic data. Using 1,570 samples from 300 infants, we fit 7,865 models for 6 host phenotypes. We demonstrate the dependence of accuracy on algorithm choice and feature definition in microbiome data and propose a framework for building microbiome-derived indicators of host phenotype. We additionally identify biological features predictive of age, sex, breastfeeding status, historical antibiotic usage, country of origin, and delivery type. Our complete results can be viewed at http://apps.chiragjpgroup.org/ubiome_predictions/. |
format | Online Article Text |
id | pubmed-7241849 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-72418492020-06-03 A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type Le Goallec, Alan Tierney, Braden T. Luber, Jacob M. Cofer, Evan M. Kostic, Aleksandar D. Patel, Chirag J. PLoS Comput Biol Research Article The microbiome is a new frontier for building predictors of human phenotypes. However, machine learning in the microbiome is fraught with issues of reproducibility, driven in large part by the wide range of analytic models and metagenomic data types available. We aimed to build robust metagenomic predictors of host phenotype by comparing prediction performances and biological interpretation across 8 machine learning methods and 4 different types of metagenomic data. Using 1,570 samples from 300 infants, we fit 7,865 models for 6 host phenotypes. We demonstrate the dependence of accuracy on algorithm choice and feature definition in microbiome data and propose a framework for building microbiome-derived indicators of host phenotype. We additionally identify biological features predictive of age, sex, breastfeeding status, historical antibiotic usage, country of origin, and delivery type. Our complete results can be viewed at http://apps.chiragjpgroup.org/ubiome_predictions/. Public Library of Science 2020-05-11 /pmc/articles/PMC7241849/ /pubmed/32392251 http://dx.doi.org/10.1371/journal.pcbi.1007895 Text en © 2020 Le Goallec et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Research Article Le Goallec, Alan Tierney, Braden T. Luber, Jacob M. Cofer, Evan M. Kostic, Aleksandar D. Patel, Chirag J. A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type |
title | A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type |
title_full | A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type |
title_fullStr | A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type |
title_full_unstemmed | A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type |
title_short | A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type |
title_sort | systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7241849/ https://www.ncbi.nlm.nih.gov/pubmed/32392251 http://dx.doi.org/10.1371/journal.pcbi.1007895 |