Comparison of predictive modeling approaches for 30-day all-cause non-elective readmission risk

BACKGROUND: This paper explores the importance of electronic medical records (EMR) for predicting 30-day all-cause non-elective readmission risk of patients and presents a comparison of prediction performance of commonly used methods. METHODS: The data are extracted from eight Advocate Health Care h...

Bibliographic Details
Main authors: Tong, Liping; Erdmann, Cole; Daldalian, Marina; Li, Jing; Esposito, Tina
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4769572/
https://www.ncbi.nlm.nih.gov/pubmed/26920363
http://dx.doi.org/10.1186/s12874-016-0128-0
author Tong, Liping
Erdmann, Cole
Daldalian, Marina
Li, Jing
Esposito, Tina
collection PubMed
description BACKGROUND: This paper explores the importance of electronic medical records (EMR) for predicting 30-day all-cause non-elective readmission risk of patients and presents a comparison of prediction performance of commonly used methods. METHODS: The data are extracted from eight Advocate Health Care hospitals. Index admissions are excluded from the cohort if they are observation, inpatient admissions for psychiatry, skilled nursing, hospice, rehabilitation, maternal and newborn visits, or if the patient expires during the index admission. Data are randomly and repeatedly divided into fitting and validating sets for cross validations. Approaches including LACE, STEPWISE logistic, LASSO logistic, and AdaBoost are compared with sample sizes varying from 2,500 to 80,000. RESULTS: Our results confirm that LACE has moderate discrimination power with the area under the receiver operating characteristic curve (AUC) around 0.65-0.66, which can be improved to 0.73-0.74 when additional variables from EMR are considered. These variables include Inpatient in the last six months, Number of emergency room visits or inpatients in the last year, Braden score, Polypharmacy, Employment status, Discharge disposition, Albumin level, and medical condition variables such as Leukemia, Malignancy, Renal failure with hemodialysis, History of alcohol substance abuse, Dementia and Trauma. When sample size is small (≤5,000), LASSO is the best; when sample size is large (≥20,000), the predictive performance is similar. The STEPWISE method has a slightly lower AUC (0.734) compared to LASSO (0.737) and AdaBoost (0.737). More than one half of the selected predictors can be false positives when using a single method and a single division of fitting/validating data. CONCLUSIONS: True predictors can be identified by repeatedly dividing data into fitting/validating subsets and referring the final model based on summarizing results. LASSO is a better alternative to the STEPWISE logistic regression, especially when sample size is not large. The evidence for adequate sample size can be explored by fitting models on gradually reduced samples. Our model comparison strategy is not only good for 30-day all-cause non-elective readmission risk predictions, but also applicable to other types of predictive models in clinical studies.
format Online
Article
Text
id pubmed-4769572
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4769572 2016-02-28 Comparison of predictive modeling approaches for 30-day all-cause non-elective readmission risk Tong, Liping Erdmann, Cole Daldalian, Marina Li, Jing Esposito, Tina BMC Med Res Methodol Research Article BACKGROUND: This paper explores the importance of electronic medical records (EMR) for predicting 30-day all-cause non-elective readmission risk of patients and presents a comparison of prediction performance of commonly used methods. METHODS: The data are extracted from eight Advocate Health Care hospitals. Index admissions are excluded from the cohort if they are observation, inpatient admissions for psychiatry, skilled nursing, hospice, rehabilitation, maternal and newborn visits, or if the patient expires during the index admission. Data are randomly and repeatedly divided into fitting and validating sets for cross validations. Approaches including LACE, STEPWISE logistic, LASSO logistic, and AdaBoost are compared with sample sizes varying from 2,500 to 80,000. RESULTS: Our results confirm that LACE has moderate discrimination power with the area under the receiver operating characteristic curve (AUC) around 0.65-0.66, which can be improved to 0.73-0.74 when additional variables from EMR are considered. These variables include Inpatient in the last six months, Number of emergency room visits or inpatients in the last year, Braden score, Polypharmacy, Employment status, Discharge disposition, Albumin level, and medical condition variables such as Leukemia, Malignancy, Renal failure with hemodialysis, History of alcohol substance abuse, Dementia and Trauma. When sample size is small (≤5,000), LASSO is the best; when sample size is large (≥20,000), the predictive performance is similar. The STEPWISE method has a slightly lower AUC (0.734) compared to LASSO (0.737) and AdaBoost (0.737). More than one half of the selected predictors can be false positives when using a single method and a single division of fitting/validating data. CONCLUSIONS: True predictors can be identified by repeatedly dividing data into fitting/validating subsets and referring the final model based on summarizing results. LASSO is a better alternative to the STEPWISE logistic regression, especially when sample size is not large. The evidence for adequate sample size can be explored by fitting models on gradually reduced samples. Our model comparison strategy is not only good for 30-day all-cause non-elective readmission risk predictions, but also applicable to other types of predictive models in clinical studies. BioMed Central 2016-02-27 /pmc/articles/PMC4769572/ /pubmed/26920363 http://dx.doi.org/10.1186/s12874-016-0128-0 Text en © Tong et al. 2016 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Comparison of predictive modeling approaches for 30-day all-cause non-elective readmission risk
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4769572/
https://www.ncbi.nlm.nih.gov/pubmed/26920363
http://dx.doi.org/10.1186/s12874-016-0128-0