DeepMicroGen: a generative adversarial network-based method for longitudinal microbiome data imputation

MOTIVATION: The human microbiome, which is linked to various diseases by growing evidence, has a profound impact on human health. Since changes in the composition of the microbiome across time are associated with disease and clinical outcomes, microbiome analysis should be performed in a longitudina...

Bibliographic Details
Main authors: Choi, Joung Min; Ji, Ming; Watson, Layne T; Zhang, Liqing
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10196688/
https://www.ncbi.nlm.nih.gov/pubmed/37099704
http://dx.doi.org/10.1093/bioinformatics/btad286
author Choi, Joung Min
Ji, Ming
Watson, Layne T
Zhang, Liqing
collection PubMed
description MOTIVATION: The human microbiome, which is linked to various diseases by growing evidence, has a profound impact on human health. Since changes in the composition of the microbiome across time are associated with disease and clinical outcomes, microbiome analysis should be performed in a longitudinal study. However, due to limited sample sizes and differing numbers of timepoints for different subjects, a significant amount of data cannot be utilized, directly affecting the quality of analysis results. Deep generative models have been proposed to address this lack of data issue. Specifically, a generative adversarial network (GAN) has been successfully utilized for data augmentation to improve prediction tasks. Recent studies have also shown improved performance of GAN-based models for missing value imputation in a multivariate time series dataset compared with traditional imputation methods. RESULTS: This work proposes DeepMicroGen, a bidirectional recurrent neural network-based GAN model, trained on the temporal relationship between the observations, to impute the missing microbiome samples in longitudinal studies. DeepMicroGen outperforms standard baseline imputation methods, showing the lowest mean absolute error for both simulated and real datasets. Finally, the proposed model improved the predicted clinical outcome for allergies, by providing imputation for an incomplete longitudinal dataset used to train the classifier. AVAILABILITY AND IMPLEMENTATION: DeepMicroGen is publicly available at https://github.com/joungmin-choi/DeepMicroGen.
format Online
Article
Text
id pubmed-10196688
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10196688 2023-05-20 DeepMicroGen: a generative adversarial network-based method for longitudinal microbiome data imputation. Choi, Joung Min; Ji, Ming; Watson, Layne T; Zhang, Liqing. Bioinformatics, Original Paper. Oxford University Press 2023-04-26. /pmc/articles/PMC10196688/ /pubmed/37099704 http://dx.doi.org/10.1093/bioinformatics/btad286 Text en © The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title DeepMicroGen: a generative adversarial network-based method for longitudinal microbiome data imputation
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10196688/
https://www.ncbi.nlm.nih.gov/pubmed/37099704
http://dx.doi.org/10.1093/bioinformatics/btad286