
Hybrid feature engineering of medical data via variational autoencoders with triplet loss: a COVID-19 prognosis study

Bibliographic Details
Main authors: Mahdavi, Mahdi; Choubdar, Hadi; Rostami, Zahra; Niroomand, Behnaz; Levine, Alexandra T.; Fatemi, Alireza; Bolhasani, Ehsan; Vahabie, Abdol-Hossein; Lomber, Stephen G.; Merrikhi, Yaser
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9936112/
https://www.ncbi.nlm.nih.gov/pubmed/36808151
http://dx.doi.org/10.1038/s41598-023-29334-0
author Mahdavi, Mahdi
Choubdar, Hadi
Rostami, Zahra
Niroomand, Behnaz
Levine, Alexandra T.
Fatemi, Alireza
Bolhasani, Ehsan
Vahabie, Abdol-Hossein
Lomber, Stephen G.
Merrikhi, Yaser
author_sort Mahdavi, Mahdi
collection PubMed
description Medical machine learning frameworks have received much attention in recent years. The COVID-19 pandemic was accompanied by a surge in proposed machine learning algorithms for tasks such as diagnosis and mortality prognosis. Machine learning frameworks can be helpful medical assistants by extracting data patterns that are otherwise hard for humans to detect. Efficient feature engineering and dimensionality reduction are major challenges in most medical machine learning frameworks. Autoencoders are novel unsupervised tools that can perform data-driven dimensionality reduction with minimal prior assumptions. This study, in a novel approach, investigated the predictive power of latent representations obtained from a hybrid autoencoder (HAE) framework, which combines variational autoencoder (VAE) characteristics with mean squared error (MSE) and triplet loss, for forecasting COVID-19 patients with high mortality risk in a retrospective framework. Electronic laboratory and clinical data of 1474 patients were used in the study. Logistic regression with elastic net regularization (EN) and random forest (RF) models were used as final classifiers. We also investigated the contribution of the input features to the latent representations via mutual information analysis. On the hold-out data, the model trained on HAE latent representations achieved an area under the ROC curve (AUC) of 0.921 (±0.027) with the EN predictor and 0.910 (±0.036) with the RF predictor, compared with 0.913 (±0.022) for EN and 0.903 (±0.020) for RF trained on the raw features. The study aims to provide an interpretable feature engineering framework for the medical environment, with the potential to integrate imaging data for efficient feature engineering in rapid triage and other clinical predictive models.
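The hybrid objective named in the abstract (a VAE-style term combined with MSE reconstruction and a triplet loss) could be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the record does not say how the three terms are weighted or which distance the triplet term uses, so the weights `w_mse`, `w_kl`, `w_triplet`, the `margin`, the squared Euclidean distance, and all function names here are hypothetical.

```python
import numpy as np


def mse_loss(x, x_hat):
    """Mean squared reconstruction error, averaged over all entries."""
    return float(np.mean((x - x_hat) ** 2))


def kl_loss(mu, log_var):
    """KL divergence from N(mu, exp(log_var)) to N(0, I), averaged over the batch."""
    kl_per_sample = -0.5 * np.sum(1.0 + log_var - mu ** 2 - np.exp(log_var), axis=1)
    return float(np.mean(kl_per_sample))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge triplet loss on squared Euclidean distances (distance choice is an assumption)."""
    d_pos = np.sum((anchor - positive) ** 2, axis=1)
    d_neg = np.sum((anchor - negative) ** 2, axis=1)
    return float(np.mean(np.maximum(d_pos - d_neg + margin, 0.0)))


def hybrid_loss(x, x_hat, mu, log_var, anchor, positive, negative,
                w_mse=1.0, w_kl=1.0, w_triplet=1.0, margin=1.0):
    """Weighted sum of the three terms; the weights are hypothetical, not from the paper."""
    return (w_mse * mse_loss(x, x_hat)
            + w_kl * kl_loss(mu, log_var)
            + w_triplet * triplet_loss(anchor, positive, negative, margin))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 5))                      # toy batch: 8 patients, 5 features
    x_hat = x + 0.1 * rng.normal(size=x.shape)       # stand-in for decoder output
    mu = rng.normal(size=(8, 2))                     # stand-in for encoder means
    log_var = np.zeros((8, 2))                       # unit variance in latent space
    # In practice anchor/positive share an outcome label and negative has the other label.
    anchor, positive, negative = mu, mu + 0.05, mu[::-1]
    print(hybrid_loss(x, x_hat, mu, log_var, anchor, positive, negative))
```

In a setup like this, the triplet term would pull latent codes of patients with the same outcome together and push codes of patients with different outcomes apart, while the MSE and KL terms keep the latent space a usable compressed representation of the inputs.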
format Online
Article
Text
id pubmed-9936112
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-9936112 2023-02-17 Hybrid feature engineering of medical data via variational autoencoders with triplet loss: a COVID-19 prognosis study. Mahdavi, Mahdi; Choubdar, Hadi; Rostami, Zahra; Niroomand, Behnaz; Levine, Alexandra T.; Fatemi, Alireza; Bolhasani, Ehsan; Vahabie, Abdol-Hossein; Lomber, Stephen G.; Merrikhi, Yaser. Sci Rep. Article. Nature Publishing Group UK 2023-02-17 /pmc/articles/PMC9936112/ /pubmed/36808151 http://dx.doi.org/10.1038/s41598-023-29334-0 Text en © Crown 2023 https://creativecommons.org/licenses/by/4.0/
Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title Hybrid feature engineering of medical data via variational autoencoders with triplet loss: a COVID-19 prognosis study
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9936112/
https://www.ncbi.nlm.nih.gov/pubmed/36808151
http://dx.doi.org/10.1038/s41598-023-29334-0