Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials
Goal: To develop a computationally efficient and unbiased synthetic data generator for large-scale in silico clinical trials (CTs). Methods: We propose the BGMM-OCE, an extension of the conventional BGMM (Bayesian Gaussian Mixture Models) algorithm to provide unbiased estimations regarding the optim...
Format: | Online Article Text |
---|---|
Language: | English |
Published: | IEEE 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9970043/ https://www.ncbi.nlm.nih.gov/pubmed/36860496 http://dx.doi.org/10.1109/OJEMB.2022.3181796 |
_version_ | 1784897838706065408 |
---|---|
collection | PubMed |
description | Goal: To develop a computationally efficient and unbiased synthetic data generator for large-scale in silico clinical trials (CTs). Methods: We propose the BGMM-OCE, an extension of the conventional BGMM (Bayesian Gaussian Mixture Models) algorithm to provide unbiased estimations regarding the optimal number of Gaussian components and yield high-quality, large-scale synthetic data at reduced computational complexity. Spectral clustering with efficient eigenvalue decomposition is applied to estimate the hyperparameters of the generator. A case study is conducted to compare the performance of BGMM-OCE against four straightforward synthetic data generators for in silico CTs in hypertrophic cardiomyopathy (HCM). Results: The BGMM-OCE generated 30000 virtual patient profiles having the lowest coefficient-of-variation (0.046), inter- and intra-correlation differences (0.017, and 0.016, respectively) with the real ones in reduced execution time. Conclusions: BGMM-OCE overcomes the lack of population size in HCM which obscures the development of targeted therapies and robust risk stratification models. |
format | Online Article Text |
id | pubmed-9970043 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | IEEE |
record_format | MEDLINE/PubMed |
spelling | pubmed-9970043 2023-02-28 Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials IEEE Open J Eng Med Biol Article Goal: To develop a computationally efficient and unbiased synthetic data generator for large-scale in silico clinical trials (CTs). Methods: We propose the BGMM-OCE, an extension of the conventional BGMM (Bayesian Gaussian Mixture Models) algorithm to provide unbiased estimations regarding the optimal number of Gaussian components and yield high-quality, large-scale synthetic data at reduced computational complexity. Spectral clustering with efficient eigenvalue decomposition is applied to estimate the hyperparameters of the generator. A case study is conducted to compare the performance of BGMM-OCE against four straightforward synthetic data generators for in silico CTs in hypertrophic cardiomyopathy (HCM). Results: The BGMM-OCE generated 30000 virtual patient profiles having the lowest coefficient-of-variation (0.046), inter- and intra-correlation differences (0.017, and 0.016, respectively) with the real ones in reduced execution time. Conclusions: BGMM-OCE overcomes the lack of population size in HCM which obscures the development of targeted therapies and robust risk stratification models. IEEE 2022-06-10 /pmc/articles/PMC9970043/ /pubmed/36860496 http://dx.doi.org/10.1109/OJEMB.2022.3181796 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/ |
spellingShingle | Article Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials |
title | Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials |
title_full | Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials |
title_fullStr | Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials |
title_full_unstemmed | Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials |
title_short | Bayesian Inference-Based Gaussian Mixture Models With Optimal Components Estimation Towards Large-Scale Synthetic Data Generation for In Silico Clinical Trials |
title_sort | bayesian inference-based gaussian mixture models with optimal components estimation towards large-scale synthetic data generation for in silico clinical trials |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9970043/ https://www.ncbi.nlm.nih.gov/pubmed/36860496 http://dx.doi.org/10.1109/OJEMB.2022.3181796 |
work_keys_str_mv | AT bayesianinferencebasedgaussianmixturemodelswithoptimalcomponentsestimationtowardslargescalesyntheticdatagenerationforinsilicoclinicaltrials |
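
The description above names a Bayesian Gaussian mixture model as the basis of the generator. As a rough illustration only, the following sketch fits a conventional BGMM with scikit-learn and samples synthetic records from it. It is not the BGMM-OCE method from the article: the spectral-clustering step used there to estimate hyperparameters is omitted, the toy data, the upper bound of 10 components, and the correlation-difference measure are all assumptions made for this example, and the article's own metric definitions may differ.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

# Hypothetical stand-in for real patient data: two clusters, three features.
rng = np.random.default_rng(0)
real = np.vstack([
    rng.normal(loc=[0.0, 0.0, 0.0], scale=1.0, size=(200, 3)),
    rng.normal(loc=[5.0, 5.0, 5.0], scale=1.0, size=(200, 3)),
])

# n_components is only an upper bound (assumed value). The Dirichlet-process
# prior lets the weights of unneeded components shrink towards zero.
bgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    max_iter=500,
    random_state=0,
).fit(real)

# Draw synthetic records; sample() returns the data and the component labels.
synthetic, labels = bgmm.sample(1000)

# Illustrative quality check (an assumption, not the article's definition):
# mean absolute difference between the real and synthetic correlation matrices.
corr_diff = np.abs(
    np.corrcoef(real, rowvar=False) - np.corrcoef(synthetic, rowvar=False)
).mean()

print("synthetic shape:", synthetic.shape)
print("mean absolute correlation difference:", corr_diff)
```

Inspecting `bgmm.weights_` after fitting shows how many components carry non-negligible weight, which is one informal way to read off an effective component count from a conventional BGMM.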