
Leveraging Scheme for Cross-Study Microbiome Machine Learning Prediction and Feature Evaluations

The microbiota has proved to be one of the critical factors for many diseases, and researchers have been using microbiome data for disease prediction. However, models trained on one independent microbiome study may not be easily applicable to other independent studies due to the high level of variab...


Bibliographic Details
Main Authors: Song, Kuncheng, Zhou, Yi-Hui
Format: Online Article Text
Language: English
Published: MDPI 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9952031/
https://www.ncbi.nlm.nih.gov/pubmed/36829725
http://dx.doi.org/10.3390/bioengineering10020231
_version_ 1784893528937070592
author Song, Kuncheng
Zhou, Yi-Hui
collection PubMed
description The microbiota has proved to be one of the critical factors for many diseases, and researchers have been using microbiome data for disease prediction. However, models trained on one independent microbiome study may not be easily applicable to other independent studies due to the high level of variability in microbiome data. In this study, we developed a method for improving the generalizability and interpretability of machine learning models for predicting three different diseases (colorectal cancer, Crohn’s disease, and immunotherapy response) using nine independent microbiome datasets. Our method involves combining a smaller dataset with a larger dataset, and we found that using at least 25% of the target samples in the source data resulted in improved model performance. We determined random forest as our top model and employed feature selection to identify common and important taxa for disease prediction across the different studies. Our results suggest that this leveraging scheme is a promising approach for improving the accuracy and interpretability of machine learning models for predicting diseases based on microbiome data.
format Online
Article
Text
id pubmed-9952031
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9952031 2023-02-25 Leveraging Scheme for Cross-Study Microbiome Machine Learning Prediction and Feature Evaluations Song, Kuncheng Zhou, Yi-Hui Bioengineering (Basel) Article MDPI 2023-02-08 /pmc/articles/PMC9952031/ /pubmed/36829725 http://dx.doi.org/10.3390/bioengineering10020231 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Leveraging Scheme for Cross-Study Microbiome Machine Learning Prediction and Feature Evaluations
topic Article
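
The description above says the method combines a smaller dataset with a larger one and that using at least 25% of the target samples in the source data improved performance. The following is a minimal sketch of one reading of that data split, not the authors' implementation. The function name `leverage_split`, its parameters, and the toy sample tuples are hypothetical, and the sketch assumes the leveraged target samples are drawn at random and that evaluation uses only the remaining target samples.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def leverage_split(
    source: Sequence[T],
    target: Sequence[T],
    fraction: float = 0.25,
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Return (train, test) for a leveraging-style split.

    train = every source sample plus a random `fraction` of target samples
    test  = the target samples that were not moved into train
    (Hypothetical helper; the random draw is an assumption of this sketch.)
    """
    if not 0.0 <= fraction < 1.0:
        raise ValueError("fraction must be in [0, 1)")

    rng = random.Random(seed)  # seeded for a reproducible draw
    indices = list(range(len(target)))
    rng.shuffle(indices)

    # Number of target samples to move into the training data.
    n_leveraged = round(fraction * len(target))
    leveraged = set(indices[:n_leveraged])

    train = list(source) + [target[i] for i in sorted(leveraged)]
    test = [target[i] for i in range(len(target)) if i not in leveraged]
    return train, test


if __name__ == "__main__":
    # Toy stand-ins for samples from a larger source study and a smaller target study.
    source = [("src", i) for i in range(100)]
    target = [("tgt", i) for i in range(40)]
    train, test = leverage_split(source, target, fraction=0.25, seed=1)
    print(len(train), len(test))  # prints "110 30"
```

A classifier, such as the random forest named in the description, would then be fit on `train` and scored on `test`; model fitting is left out of the sketch on purpose.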