An explainable machine learning framework for lung cancer hospital length of stay prediction
This work introduces a predictive length of stay (LOS) framework for lung cancer patients using machine learning (ML) models. The framework is designed to handle imbalanced datasets in classification-based approaches using electronic healthcare records (EHR). We have utilized supervised ML methods...
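The summary above refers to handling imbalanced datasets. As a rough illustration only, the following sketch shows the interpolation idea behind SMOTE-style over-sampling in plain Python. It is a simplified stand-in, not the authors' pipeline: the function name `smote_like_oversample`, the toy feature rows, and the choice of `k` are all hypothetical.

```python
import math
import random


def smote_like_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by interpolating between a
    sampled row and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by Euclidean distance to the base row.
        neighbours = sorted(
            (row for row in minority if row is not base),
            key=lambda row: math.dist(base, row),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data (two numeric features per stay); not from MIMIC-III.
majority = [[1.0, 2.0], [1.2, 1.8], [0.9, 2.1], [1.1, 2.2], [1.3, 1.9], [0.8, 2.0]]
minority = [[3.0, 4.0], [3.2, 4.1], [2.9, 3.8]]

# Generate enough synthetic rows to match the majority class size.
new_rows = smote_like_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_rows
```

Because each synthetic row lies on a segment between two real minority rows, the new points stay inside the region already occupied by the minority class. A real project would more likely rely on a maintained library implementation than on this sketch.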
Main authors: | Alsinglawi, Belal; Alshari, Osama; Alorjani, Mohammed; Mubin, Omar; Alnajjar, Fady; Novoa, Mauricio; Darwish, Omar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Nature Publishing Group UK
2022
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8755804/ https://www.ncbi.nlm.nih.gov/pubmed/35022512 http://dx.doi.org/10.1038/s41598-021-04608-7 |
_version_ | 1784632449924333568 |
---|---|
author | Alsinglawi, Belal; Alshari, Osama; Alorjani, Mohammed; Mubin, Omar; Alnajjar, Fady; Novoa, Mauricio; Darwish, Omar |
author_facet | Alsinglawi, Belal; Alshari, Osama; Alorjani, Mohammed; Mubin, Omar; Alnajjar, Fady; Novoa, Mauricio; Darwish, Omar |
author_sort | Alsinglawi, Belal |
collection | PubMed |
description | This work introduces a predictive length of stay (LOS) framework for lung cancer patients using machine learning (ML) models. The framework is designed to handle imbalanced datasets in classification-based approaches using electronic healthcare records (EHR). We used supervised ML methods to predict the LOS of lung cancer inpatients during ICU hospitalization using the MIMIC-III dataset. The Random Forest (RF) model outperformed the other models across the three framework phases. With clinically significant feature selection, the over-sampling methods (SMOTE and ADASYN) achieved the highest AUC results (98% with 95% CI: 95.3–100%, and 100%, respectively). The combinations of over-sampling and under-sampling achieved the second-highest AUC results (98% with 95% CI: 95.3–100% for SMOTE-Tomek, and 97% with 95% CI: 93.7–100% for SMOTE-ENN). The under-sampling methods reported the lowest AUC results (50% with 95% CI: 40.2–59.8%) for both ENN and Tomek Links. Using an explainable ML technique called SHAP, we explained the output of the RF model trained with SMOTE class balancing to identify the clinical features that contributed most to predicting lung cancer LOS. Our promising framework allows ML techniques to be employed in hospital clinical information systems to predict lung cancer admissions into the ICU. |
format | Online Article Text |
id | pubmed-8755804 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-8755804 2022-01-14 An explainable machine learning framework for lung cancer hospital length of stay prediction Alsinglawi, Belal Alshari, Osama Alorjani, Mohammed Mubin, Omar Alnajjar, Fady Novoa, Mauricio Darwish, Omar Sci Rep Article
Nature Publishing Group UK 2022-01-12 /pmc/articles/PMC8755804/ /pubmed/35022512 http://dx.doi.org/10.1038/s41598-021-04608-7 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Alsinglawi, Belal Alshari, Osama Alorjani, Mohammed Mubin, Omar Alnajjar, Fady Novoa, Mauricio Darwish, Omar An explainable machine learning framework for lung cancer hospital length of stay prediction |
title | An explainable machine learning framework for lung cancer hospital length of stay prediction |
title_sort | explainable machine learning framework for lung cancer hospital length of stay prediction |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8755804/ https://www.ncbi.nlm.nih.gov/pubmed/35022512 http://dx.doi.org/10.1038/s41598-021-04608-7 |
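The description field in this record reports AUC values together with 95% confidence intervals. The record does not say how those intervals were computed, so the sketch below is only one common way to obtain such figures: a rank-based AUC plus a percentile bootstrap. The function names `auc_score` and `bootstrap_ci`, the toy labels and scores, and the bootstrap settings are all assumptions made for illustration.

```python
import random


def auc_score(labels, scores):
    """AUC as the probability that a random positive case is scored above
    a random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # AUC is undefined when a resample has only one class
        stats.append(auc_score(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical toy predictions: label 1 = long stay, score = model probability.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.1, 0.3, 0.35, 0.6, 0.4, 0.7, 0.8, 0.9]

auc = auc_score(labels, scores)
ci_low, ci_high = bootstrap_ci(labels, scores)
print(f"AUC = {auc:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

With a sample this small the interval is wide, which is a useful reminder that an AUC near 100% on a modest test set should be read together with its interval.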