
A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type

The microbiome is a new frontier for building predictors of human phenotypes. However, machine learning in the microbiome is fraught with issues of reproducibility, driven in large part by the wide range of analytic models and metagenomic data types available. We aimed to build robust metagenomic pr...


Bibliographic Details
Main Authors: Le Goallec, Alan, Tierney, Braden T., Luber, Jacob M., Cofer, Evan M., Kostic, Aleksandar D., Patel, Chirag J.
Format: Online Article Text
Language: English
Published: Public Library of Science 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7241849/
https://www.ncbi.nlm.nih.gov/pubmed/32392251
http://dx.doi.org/10.1371/journal.pcbi.1007895
_version_ 1783537143811932160
author Le Goallec, Alan
Tierney, Braden T.
Luber, Jacob M.
Cofer, Evan M.
Kostic, Aleksandar D.
Patel, Chirag J.
author_facet Le Goallec, Alan
Tierney, Braden T.
Luber, Jacob M.
Cofer, Evan M.
Kostic, Aleksandar D.
Patel, Chirag J.
author_sort Le Goallec, Alan
collection PubMed
description The microbiome is a new frontier for building predictors of human phenotypes. However, machine learning in the microbiome is fraught with issues of reproducibility, driven in large part by the wide range of analytic models and metagenomic data types available. We aimed to build robust metagenomic predictors of host phenotype by comparing prediction performances and biological interpretation across 8 machine learning methods and 4 different types of metagenomic data. Using 1,570 samples from 300 infants, we fit 7,865 models for 6 host phenotypes. We demonstrate the dependence of accuracy on algorithm choice and feature definition in microbiome data and propose a framework for building microbiome-derived indicators of host phenotype. We additionally identify biological features predictive of age, sex, breastfeeding status, historical antibiotic usage, country of origin, and delivery type. Our complete results can be viewed at http://apps.chiragjpgroup.org/ubiome_predictions/.
format Online
Article
Text
id pubmed-7241849
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-72418492020-06-03 A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type Le Goallec, Alan Tierney, Braden T. Luber, Jacob M. Cofer, Evan M. Kostic, Aleksandar D. Patel, Chirag J. PLoS Comput Biol Research Article The microbiome is a new frontier for building predictors of human phenotypes. However, machine learning in the microbiome is fraught with issues of reproducibility, driven in large part by the wide range of analytic models and metagenomic data types available. We aimed to build robust metagenomic predictors of host phenotype by comparing prediction performances and biological interpretation across 8 machine learning methods and 4 different types of metagenomic data. Using 1,570 samples from 300 infants, we fit 7,865 models for 6 host phenotypes. We demonstrate the dependence of accuracy on algorithm choice and feature definition in microbiome data and propose a framework for building microbiome-derived indicators of host phenotype. We additionally identify biological features predictive of age, sex, breastfeeding status, historical antibiotic usage, country of origin, and delivery type. Our complete results can be viewed at http://apps.chiragjpgroup.org/ubiome_predictions/. Public Library of Science 2020-05-11 /pmc/articles/PMC7241849/ /pubmed/32392251 http://dx.doi.org/10.1371/journal.pcbi.1007895 Text en © 2020 Le Goallec et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/) , which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Le Goallec, Alan
Tierney, Braden T.
Luber, Jacob M.
Cofer, Evan M.
Kostic, Aleksandar D.
Patel, Chirag J.
A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type
title A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type
title_full A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type
title_fullStr A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type
title_full_unstemmed A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type
title_short A systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type
title_sort systematic machine learning and data type comparison yields metagenomic predictors of infant age, sex, breastfeeding, antibiotic usage, country of origin, and delivery type
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7241849/
https://www.ncbi.nlm.nih.gov/pubmed/32392251
http://dx.doi.org/10.1371/journal.pcbi.1007895