
An explainable machine learning framework for lung cancer hospital length of stay prediction

This work introduces a predictive Length of Stay (LOS) framework for lung cancer patients using machine learning (ML) models. The framework is designed to handle imbalanced datasets in classification-based approaches using electronic healthcare records (EHR). We have utilized supervised ML methods...


Bibliographic Details
Main authors: Alsinglawi, Belal; Alshari, Osama; Alorjani, Mohammed; Mubin, Omar; Alnajjar, Fady; Novoa, Mauricio; Darwish, Omar
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8755804/
https://www.ncbi.nlm.nih.gov/pubmed/35022512
http://dx.doi.org/10.1038/s41598-021-04608-7
collection PubMed
description This work introduces a predictive Length of Stay (LOS) framework for lung cancer patients using machine learning (ML) models. The framework is designed to handle imbalanced datasets in classification-based approaches using electronic healthcare records (EHR). We used supervised ML methods to predict the LOS of lung cancer inpatients during ICU hospitalization using the MIMIC-III dataset. The Random Forest (RF) model outperformed the other models during the three framework phases. With feature selection based on clinical significance, the over-sampling methods SMOTE and ADASYN achieved the highest AUC results (98%, 95% CI: 95.3–100%, and 100%, respectively). The combination of over-sampling and under-sampling achieved the second-highest AUC results (98%, 95% CI: 95.3–100%, for SMOTE-Tomek and 97%, 95% CI: 93.7–100%, for SMOTE-ENN). The under-sampling methods (ENN and Tomek Links) reported the lowest AUC results (50%, 95% CI: 40.2–59.8%, for both). Using the explainable ML technique SHAP, we explained the output of the RF model trained with SMOTE class balancing to identify the clinical features that contributed most to predicting lung cancer LOS. The framework allows ML techniques to be employed in hospital clinical information systems to predict lung cancer admissions to the ICU.
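To make the class-balancing and AUC terms in the description concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the basic SMOTE idea (a synthetic minority sample is an interpolation between a minority sample and one of its nearest minority neighbours) and the rank-based definition of AUC. The function names `smote` and `auc`, the parameter `k=3`, and the toy data are illustrative assumptions and do not come from the article or from MIMIC-III.

```python
import random
from math import dist


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x, excluding x itself
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def auc(labels, scores):
    """AUC as the probability that a random positive is scored above a
    random negative; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: two numeric features per minority-class patient.
minority = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.4), (0.9, 2.1)]
new_points = smote(minority, n_new=6)
print(len(new_points))                               # 6
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))       # 0.75
```

Under this definition, an AUC of 50%, as reported for ENN and Tomek Links, corresponds to ranking positives and negatives no better than chance.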
format Online
Article
Text
id pubmed-8755804
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-8755804 2022-01-14 An explainable machine learning framework for lung cancer hospital length of stay prediction Sci Rep Article
Nature Publishing Group UK 2022-01-12 /pmc/articles/PMC8755804/ /pubmed/35022512 http://dx.doi.org/10.1038/s41598-021-04608-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.