Data processing pipeline for cardiogenic shock prediction using machine learning
INTRODUCTION: Recent advances in machine learning provide new possibilities to process and analyse observational patient data to predict patient outcomes. In this paper, we introduce a data processing pipeline for cardiogenic shock (CS) prediction from the MIMIC III database of intensive cardiac car...
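Main authors: | Jajcay, Nikola; Bezak, Branislav; Segev, Amitai; Matetzky, Shlomi; Jankova, Jana; Spartalis, Michael; El Tahlawi, Mohammad; Guerra, Federico; Friebel, Julian; Thevathasan, Tharusan; Berta, Imrich; Pölzl, Leo; Nägele, Felix; Pogran, Edita; Cader, F. Aaysha; Jarakovic, Milana; Gollmann-Tepeköylü, Can; Kollarova, Marta; Petrikova, Katarina; Tica, Otilia; Krychtiuk, Konstantin A.; Tavazzi, Guido; Skurk, Carsten; Huber, Kurt; Böhm, Allan |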
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Cardiovascular Medicine |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10077147/ https://www.ncbi.nlm.nih.gov/pubmed/37034352 http://dx.doi.org/10.3389/fcvm.2023.1132680 |
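
The abstract of this record names k-nearest neighbours among the imputation methods the authors compared. As an illustration only, and not the authors' implementation, a minimal pure-Python sketch of k-nearest-neighbour imputation might look like the following. The function name `knn_impute`, the root-mean-square distance over jointly observed features, and the toy data are all assumptions made for this example; they do not come from the paper or from MIMIC III.

```python
import math


def knn_impute(rows, k=2):
    """Return a copy of `rows` with None entries filled in.

    Each missing value is replaced by the mean of that feature over the
    k nearest rows that have the feature observed. Distance between two
    rows is the root-mean-square difference over the features observed
    in both rows (an assumption made for this sketch).
    """
    imputed = [list(r) for r in rows]  # never mutate the input
    for i, row in enumerate(rows):
        missing = [j for j, v in enumerate(row) if v is None]
        if not missing:
            continue

        # Distance from row i to every other row with shared features.
        dists = []
        for m, other in enumerate(rows):
            if m == i:
                continue
            shared = [(a, b) for a, b in zip(row, other)
                      if a is not None and b is not None]
            if not shared:
                continue
            d = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
            dists.append((d, m))
        dists.sort()

        # For each missing feature, average the k nearest observed values.
        for j in missing:
            donors = [rows[m][j] for _, m in dists
                      if rows[m][j] is not None][:k]
            if donors:
                imputed[i][j] = sum(donors) / len(donors)
    return imputed


# Hypothetical toy data: four "patients", three numeric features.
rows = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, None],   # third feature is missing
    [0.9, 1.9, 2.8],
    [9.0, 9.0, 9.0],    # far away, so not among the 2 nearest
]
filled = knn_impute(rows, k=2)
```

Here the missing value in the second row is filled from the first and third rows, which are its two nearest neighbours on the observed features. A real pipeline would normally scale the features first and would more likely use an established library implementation than hand-written code.
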
_version_ | 1785020261047730176 |
---|---|
author | Jajcay, Nikola Bezak, Branislav Segev, Amitai Matetzky, Shlomi Jankova, Jana Spartalis, Michael El Tahlawi, Mohammad Guerra, Federico Friebel, Julian Thevathasan, Tharusan Berta, Imrich Pölzl, Leo Nägele, Felix Pogran, Edita Cader, F. Aaysha Jarakovic, Milana Gollmann-Tepeköylü, Can Kollarova, Marta Petrikova, Katarina Tica, Otilia Krychtiuk, Konstantin A. Tavazzi, Guido Skurk, Carsten Huber, Kurt Böhm, Allan |
author_sort | Jajcay, Nikola |
collection | PubMed |
description | INTRODUCTION: Recent advances in machine learning provide new possibilities to process and analyse observational patient data to predict patient outcomes. In this paper, we introduce a data processing pipeline for cardiogenic shock (CS) prediction from the MIMIC III database of intensive cardiac care unit patients with acute coronary syndrome. The ability to identify high-risk patients could possibly allow taking pre-emptive measures and thus prevent the development of CS. METHODS: We mainly focus on techniques for the imputation of missing data by generating a pipeline for imputation and comparing the performance of various multivariate imputation algorithms, including k-nearest neighbours, two singular value decomposition (SVD)-based methods, and Multiple Imputation by Chained Equations. After imputation, we select the final subjects and variables from the imputed dataset and showcase the performance of the gradient-boosted framework that uses a tree-based classifier for cardiogenic shock prediction. RESULTS: We achieved good classification performance thanks to data cleaning and imputation (cross-validated mean area under the curve 0.805) without hyperparameter optimization. CONCLUSION: We believe our pre-processing pipeline would prove helpful also for other classification and regression experiments. |
format | Online Article Text |
id | pubmed-10077147 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-100771472023-04-07 Data processing pipeline for cardiogenic shock prediction using machine learning Jajcay, Nikola Bezak, Branislav Segev, Amitai Matetzky, Shlomi Jankova, Jana Spartalis, Michael El Tahlawi, Mohammad Guerra, Federico Friebel, Julian Thevathasan, Tharusan Berta, Imrich Pölzl, Leo Nägele, Felix Pogran, Edita Cader, F. Aaysha Jarakovic, Milana Gollmann-Tepeköylü, Can Kollarova, Marta Petrikova, Katarina Tica, Otilia Krychtiuk, Konstantin A. Tavazzi, Guido Skurk, Carsten Huber, Kurt Böhm, Allan Front Cardiovasc Med Cardiovascular Medicine INTRODUCTION: Recent advances in machine learning provide new possibilities to process and analyse observational patient data to predict patient outcomes. In this paper, we introduce a data processing pipeline for cardiogenic shock (CS) prediction from the MIMIC III database of intensive cardiac care unit patients with acute coronary syndrome. The ability to identify high-risk patients could possibly allow taking pre-emptive measures and thus prevent the development of CS. METHODS: We mainly focus on techniques for the imputation of missing data by generating a pipeline for imputation and comparing the performance of various multivariate imputation algorithms, including k-nearest neighbours, two singular value decomposition (SVD)—based methods, and Multiple Imputation by Chained Equations. After imputation, we select the final subjects and variables from the imputed dataset and showcase the performance of the gradient-boosted framework that uses a tree-based classifier for cardiogenic shock prediction. RESULTS: We achieved good classification performance thanks to data cleaning and imputation (cross-validated mean area under the curve 0.805) without hyperparameter optimization. CONCLUSION: We believe our pre-processing pipeline would prove helpful also for other classification and regression experiments. Frontiers Media S.A. 
2023-03-23 /pmc/articles/PMC10077147/ /pubmed/37034352 http://dx.doi.org/10.3389/fcvm.2023.1132680 Text en © 2023 Jajcay, Bezak, Segev, Matetzky, Jankova, Spartalis, El Tahlawi, Guerra, Friebel, Thevathasan, Berta, Pölzl, Nägele, Pogran, Cader, Jarakovic, Gollmann-Tepeköylü, Kollarova, Petrikova, Tica, Krychtiuk, Tavazzi, Skurk, Huber and Böhm. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Data processing pipeline for cardiogenic shock prediction using machine learning |
topic | Cardiovascular Medicine |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10077147/ https://www.ncbi.nlm.nih.gov/pubmed/37034352 http://dx.doi.org/10.3389/fcvm.2023.1132680 |