Hybrid feature engineering of medical data via variational autoencoders with triplet loss: a COVID-19 prognosis study
Medical machine learning frameworks have received much attention in recent years. The recent COVID-19 pandemic was also accompanied by a surge in proposed machine learning algorithms for tasks such as diagnosis and mortality prognosis. Machine learning frameworks can be helpful medical assistants by...
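Main authors: | Mahdavi, Mahdi; Choubdar, Hadi; Rostami, Zahra; Niroomand, Behnaz; Levine, Alexandra T.; Fatemi, Alireza; Bolhasani, Ehsan; Vahabie, Abdol-Hossein; Lomber, Stephen G.; Merrikhi, Yaser |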
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9936112/ https://www.ncbi.nlm.nih.gov/pubmed/36808151 http://dx.doi.org/10.1038/s41598-023-29334-0 |
_version_ | 1784890166989553664 |
---|---|
author | Mahdavi, Mahdi; Choubdar, Hadi; Rostami, Zahra; Niroomand, Behnaz; Levine, Alexandra T.; Fatemi, Alireza; Bolhasani, Ehsan; Vahabie, Abdol-Hossein; Lomber, Stephen G.; Merrikhi, Yaser |
author_sort | Mahdavi, Mahdi |
collection | PubMed |
description | Medical machine learning frameworks have received much attention in recent years. The recent COVID-19 pandemic was also accompanied by a surge in proposed machine learning algorithms for tasks such as diagnosis and mortality prognosis. Machine learning frameworks can be helpful medical assistants by extracting data patterns that are otherwise hard for humans to detect. Efficient feature engineering and dimensionality reduction are major challenges in most medical machine learning frameworks. Autoencoders are novel unsupervised tools that can perform data-driven dimensionality reduction with minimal prior assumptions. This study investigated the predictive power of latent representations obtained from a novel hybrid autoencoder (HAE) framework, which combines variational autoencoder (VAE) characteristics with mean squared error (MSE) and triplet loss, for predicting which COVID-19 patients are at high mortality risk in a retrospective framework. Electronic laboratory and clinical data of 1474 patients were used in the study. Logistic regression with elastic net regularization (EN) and random forest (RF) models were used as final classifiers. We also investigated the contribution of the input features to the latent representations via mutual information analysis. The HAE latent-representation models achieved an area under the ROC curve (AUC) of 0.921 (±0.027) with EN and 0.910 (±0.036) with RF on the hold-out data, compared with 0.913 (±0.022) and 0.903 (±0.020), respectively, for models trained on the raw features. The study aims to provide an interpretable feature engineering framework for the medical environment with the potential to integrate imaging data for efficient feature engineering in rapid triage and other clinical predictive models. |
format | Online Article Text |
id | pubmed-9936112 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-9936112 2023-02-17 Sci Rep Article Nature Publishing Group UK 2023-02-17 /pmc/articles/PMC9936112/ /pubmed/36808151 http://dx.doi.org/10.1038/s41598-023-29334-0 Text en © Crown 2023 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
title | Hybrid feature engineering of medical data via variational autoencoders with triplet loss: a COVID-19 prognosis study |
topic | Article |
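
The description above names the three ingredients of the hybrid autoencoder objective (MSE reconstruction, the VAE's KL term, and triplet loss) but this record does not give the formula. The following is only a minimal sketch of how such terms are commonly combined, assuming a weighted sum, a standard-normal prior, and squared Euclidean distances in the triplet term. The function names, the weights `beta` and `gamma`, and the `margin` value are hypothetical and are not taken from the article.

```python
import numpy as np


def mse_loss(x, x_hat):
    """Mean squared reconstruction error over all entries."""
    return float(np.mean((x - x_hat) ** 2))


def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dims, averaged over the batch."""
    kl = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    return float(np.mean(kl))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge on squared distances: pull same-class latents together, push others apart."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def hybrid_loss(x, x_hat, mu, log_var, anchor, positive, negative,
                beta=1.0, gamma=1.0, margin=1.0):
    """Assumed weighted sum of the three terms; beta and gamma are illustrative weights."""
    return (mse_loss(x, x_hat)
            + beta * kl_divergence(mu, log_var)
            + gamma * triplet_loss(anchor, positive, negative, margin))
```

In this sketch, `anchor` and `positive` would be latent vectors of patients with the same outcome label and `negative` a latent vector of a patient with the other label. How the article actually selects triplets and weights the terms is not stated in this record.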