Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model
BACKGROUND: Acute Kidney Injury (AKI) is a shared complication among Intensive Care Unit (ICU), marked by high cost, high morbidity and high mortality. As the early prediction of AKI is critical for patients’ outcomes and data mining is such a powerful prediction tool, many AKI prediction models bas...
Main Authors: | Wang, Yuan; Wei, Yake; Yang, Hao; Li, Jingwei; Zhou, Yubo; Wu, Qin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7507620/ https://www.ncbi.nlm.nih.gov/pubmed/32957977 http://dx.doi.org/10.1186/s12911-020-01245-4 |
_version_ | 1783585265269342208 |
---|---|
author | Wang, Yuan Wei, Yake Yang, Hao Li, Jingwei Zhou, Yubo Wu, Qin |
author_facet | Wang, Yuan Wei, Yake Yang, Hao Li, Jingwei Zhou, Yubo Wu, Qin |
author_sort | Wang, Yuan |
collection | PubMed |
description | BACKGROUND: Acute Kidney Injury (AKI) is a shared complication among Intensive Care Unit (ICU), marked by high cost, high morbidity and high mortality. As the early prediction of AKI is critical for patients’ outcomes and data mining is such a powerful prediction tool, many AKI prediction models based on machine learning methods have been proposed. Our motivation is inspired by the fact that the incidence of AKI is a changing temporal sequence affected by the joint action of patients’ daily drug combinations and their physiological indexes. However, most existing models have not considered such a temporal correlation. Besides, due to great challenges caused by sparse, high-dimensional and highly imbalanced clinical data, it is hard to achieve ideal performance. METHODS: We develop a fast, simple and less-costly model based on an ensemble learning algorithm, named Ensemble Time Series Model (ETSM). Besides benefiting from vital signs and laboratory results as explicit indicators, ETSM explores the effect of drug combinations as possible implicit indicators for the AKI prediction. The model transforms temporal medication information into a multidimensional vector to consider and measure drug cumulative effects that may cause AKI. RESULTS: We compare ETSM with state-of-the-art models on ICUC and MIMIC III datasets. On the basis of the experimental results, our model obtains satisfactory performance (ICUC: AUC 24 hours ahead: 0.81, 48 hours ahead: 0.78; MIMIC III: AUC 24 hours ahead: 0.95, 48 hours ahead: 0.95). Meanwhile, we compare the effects of different sampling and feature generation methods on the model performance. In the ablation study, we validate that medication information improves model performance (24 hours ahead: AUC increased from 0.74 to 0.81). We also find that the model’s performance is closely related to the balanced level of the derivation dataset. 
The optimal ratio of major class size to minor class size for the model is found for AKI prediction. CONCLUSIONS: ETSM is an effective method for the early prediction of AKI. The model verifies that AKI incidence is related to the clinical medication. In comparison with other prediction methods, ETSM provides comparable performance results and better interpretability. |
format | Online Article Text |
id | pubmed-7507620 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-75076202020-09-23 Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model Wang, Yuan Wei, Yake Yang, Hao Li, Jingwei Zhou, Yubo Wu, Qin BMC Med Inform Decis Mak Research Article BACKGROUND: Acute Kidney Injury (AKI) is a shared complication among Intensive Care Unit (ICU), marked by high cost, high morbidity and high mortality. As the early prediction of AKI is critical for patients’ outcomes and data mining is such a powerful prediction tool, many AKI prediction models based on machine learning methods have been proposed. Our motivation is inspired by the fact that the incidence of AKI is a changing temporal sequence affected by the joint action of patients’ daily drug combinations and their physiological indexes. However, most existing models have not considered such a temporal correlation. Besides, due to great challenges caused by sparse, high-dimensional and highly imbalanced clinical data, it is hard to achieve ideal performance. METHODS: We develop a fast, simple and less-costly model based on an ensemble learning algorithm, named Ensemble Time Series Model (ETSM). Besides benefiting from vital signs and laboratory results as explicit indicators, ETSM explores the effect of drug combinations as possible implicit indicators for the AKI prediction. The model transforms temporal medication information into a multidimensional vector to consider and measure drug cumulative effects that may cause AKI. RESULTS: We compare ETSM with state-of-the-art models on ICUC and MIMIC III datasets. On the basis of the experimental results, our model obtains satisfactory performance (ICUC: AUC 24 hours ahead: 0.81, 48 hours ahead: 0.78; MIMIC III: AUC 24 hours ahead: 0.95, 48 hours ahead: 0.95). Meanwhile, we compare the effects of different sampling and feature generation methods on the model performance. 
In the ablation study, we validate that medication information improves model performance (24 hours ahead: AUC increased from 0.74 to 0.81). We also find that the model’s performance is closely related to the balanced level of the derivation dataset. The optimal ratio of major class size to minor class size for the model is found for AKI prediction. CONCLUSIONS: ETSM is an effective method for the early prediction of AKI. The model verifies that AKI incidence is related to the clinical medication. In comparison with other prediction methods, ETSM provides comparable performance results and better interpretability. BioMed Central 2020-09-21 /pmc/articles/PMC7507620/ /pubmed/32957977 http://dx.doi.org/10.1186/s12911-020-01245-4 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Article Wang, Yuan Wei, Yake Yang, Hao Li, Jingwei Zhou, Yubo Wu, Qin Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model |
title | Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model |
title_full | Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model |
title_fullStr | Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model |
title_full_unstemmed | Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model |
title_short | Utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model |
title_sort | utilizing imbalanced electronic health records to predict acute kidney injury by ensemble learning and time series model |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7507620/ https://www.ncbi.nlm.nih.gov/pubmed/32957977 http://dx.doi.org/10.1186/s12911-020-01245-4 |
work_keys_str_mv | AT wangyuan utilizingimbalancedelectronichealthrecordstopredictacutekidneyinjurybyensemblelearningandtimeseriesmodel AT weiyake utilizingimbalancedelectronichealthrecordstopredictacutekidneyinjurybyensemblelearningandtimeseriesmodel AT yanghao utilizingimbalancedelectronichealthrecordstopredictacutekidneyinjurybyensemblelearningandtimeseriesmodel AT lijingwei utilizingimbalancedelectronichealthrecordstopredictacutekidneyinjurybyensemblelearningandtimeseriesmodel AT zhouyubo utilizingimbalancedelectronichealthrecordstopredictacutekidneyinjurybyensemblelearningandtimeseriesmodel AT wuqin utilizingimbalancedelectronichealthrecordstopredictacutekidneyinjurybyensemblelearningandtimeseriesmodel |
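
The description above reports that model performance depends on how balanced the derivation dataset is, and that an optimal ratio of major class size to minor class size was found. The record does not say which sampling method or ratio was used, so the following is only a minimal sketch of one common approach, random undersampling of the majority class to a chosen ratio. The function name `undersample_to_ratio`, the ratio value, and the toy labels are all hypothetical and are not taken from the article.

```python
import random
from typing import List, Sequence


def undersample_to_ratio(labels: Sequence[int], ratio: float, seed: int = 0) -> List[int]:
    """Return sorted indices that keep every minority sample (label 1) and at
    most ratio * n_minority randomly chosen majority samples (label 0).

    Assumption: random undersampling stands in for whatever sampling
    method the article actually uses.
    """
    rng = random.Random(seed)
    minority = [i for i, y in enumerate(labels) if y == 1]
    majority = [i for i, y in enumerate(labels) if y == 0]
    # Never ask for more majority samples than exist.
    n_keep = min(len(majority), int(round(ratio * len(minority))))
    kept = rng.sample(majority, n_keep)
    return sorted(minority + kept)


# Hypothetical toy labels: 3 AKI cases (1) among 20 ICU stays.
labels = [1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
idx = undersample_to_ratio(labels, ratio=2.0)
balanced = [labels[i] for i in idx]
print(len(balanced), sum(balanced))
```

With a ratio of 2.0 the sketch keeps all three positive cases and six randomly chosen negative cases. Sweeping `ratio` over several values and comparing validation AUC would be one way to look for a best-performing balance level, although the article's own procedure may differ.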