Data processing pipeline for cardiogenic shock prediction using machine learning

INTRODUCTION: Recent advances in machine learning provide new possibilities to process and analyse observational patient data to predict patient outcomes. In this paper, we introduce a data processing pipeline for cardiogenic shock (CS) prediction from the MIMIC III database of intensive cardiac car...

Bibliographic Details
Main authors: Jajcay, Nikola, Bezak, Branislav, Segev, Amitai, Matetzky, Shlomi, Jankova, Jana, Spartalis, Michael, El Tahlawi, Mohammad, Guerra, Federico, Friebel, Julian, Thevathasan, Tharusan, Berta, Imrich, Pölzl, Leo, Nägele, Felix, Pogran, Edita, Cader, F. Aaysha, Jarakovic, Milana, Gollmann-Tepeköylü, Can, Kollarova, Marta, Petrikova, Katarina, Tica, Otilia, Krychtiuk, Konstantin A., Tavazzi, Guido, Skurk, Carsten, Huber, Kurt, Böhm, Allan
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects: Cardiovascular Medicine
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10077147/
https://www.ncbi.nlm.nih.gov/pubmed/37034352
http://dx.doi.org/10.3389/fcvm.2023.1132680
author Jajcay, Nikola
Bezak, Branislav
Segev, Amitai
Matetzky, Shlomi
Jankova, Jana
Spartalis, Michael
El Tahlawi, Mohammad
Guerra, Federico
Friebel, Julian
Thevathasan, Tharusan
Berta, Imrich
Pölzl, Leo
Nägele, Felix
Pogran, Edita
Cader, F. Aaysha
Jarakovic, Milana
Gollmann-Tepeköylü, Can
Kollarova, Marta
Petrikova, Katarina
Tica, Otilia
Krychtiuk, Konstantin A.
Tavazzi, Guido
Skurk, Carsten
Huber, Kurt
Böhm, Allan
collection PubMed
description INTRODUCTION: Recent advances in machine learning provide new possibilities to process and analyse observational patient data to predict patient outcomes. In this paper, we introduce a data processing pipeline for cardiogenic shock (CS) prediction from the MIMIC III database of intensive cardiac care unit patients with acute coronary syndrome. The ability to identify high-risk patients could possibly allow taking pre-emptive measures and thus prevent the development of CS. METHODS: We mainly focus on techniques for the imputation of missing data by generating a pipeline for imputation and comparing the performance of various multivariate imputation algorithms, including k-nearest neighbours, two singular value decomposition (SVD)-based methods, and Multiple Imputation by Chained Equations. After imputation, we select the final subjects and variables from the imputed dataset and showcase the performance of the gradient-boosted framework that uses a tree-based classifier for cardiogenic shock prediction. RESULTS: We achieved good classification performance thanks to data cleaning and imputation (cross-validated mean area under the curve 0.805) without hyperparameter optimization. CONCLUSION: We believe our pre-processing pipeline would prove helpful also for other classification and regression experiments.
format Online
Article
Text
id pubmed-10077147
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10077147 2023-04-07 Front Cardiovasc Med Cardiovascular Medicine Frontiers Media S.A. 2023-03-23 /pmc/articles/PMC10077147/ /pubmed/37034352 http://dx.doi.org/10.3389/fcvm.2023.1132680 Text en
© 2023 Jajcay, Bezak, Segev, Matetzky, Jankova, Spartalis, El Tahlawi, Guerra, Friebel, Thevathasan, Berta, Pölzl, Nägele, Pogran, Cader, Jarakovic, Gollmann-Tepeköylü, Kollarova, Petrikova, Tica, Krychtiuk, Tavazzi, Skurk, Huber and Böhm. https://creativecommons.org/licenses/by/4.0/
This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Data processing pipeline for cardiogenic shock prediction using machine learning
topic Cardiovascular Medicine
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10077147/
https://www.ncbi.nlm.nih.gov/pubmed/37034352
http://dx.doi.org/10.3389/fcvm.2023.1132680