
Comparison of Machine Learning Methods With Traditional Models for Use of Administrative Claims With Electronic Medical Records to Predict Heart Failure Outcomes

Bibliographic Details
Main Authors: Desai, Rishi J., Wang, Shirley V., Vaduganathan, Muthiah, Evers, Thomas, Schneeweiss, Sebastian
Format: Online Article Text
Language: English
Published: American Medical Association 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6991258/
https://www.ncbi.nlm.nih.gov/pubmed/31922560
http://dx.doi.org/10.1001/jamanetworkopen.2019.18962
_version_ 1783492621804503040
author Desai, Rishi J.
Wang, Shirley V.
Vaduganathan, Muthiah
Evers, Thomas
Schneeweiss, Sebastian
author_sort Desai, Rishi J.
collection PubMed
description IMPORTANCE: Accurate risk stratification of patients with heart failure (HF) is critical for deploying targeted interventions aimed at improving patients’ quality of life and outcomes.
OBJECTIVES: To compare machine learning approaches with traditional logistic regression in predicting key outcomes in patients with HF and to evaluate the added value of augmenting claims-based predictive models with electronic medical record (EMR)–derived information.
DESIGN, SETTING, AND PARTICIPANTS: A prognostic study with a 1-year follow-up period was conducted including 9502 Medicare-enrolled patients with HF from 2 health care provider networks in Boston, Massachusetts (“providers” includes physicians, clinicians, other health care professionals, and their institutions that comprise the networks). The study was performed from January 1, 2007, to December 31, 2014; data were analyzed from January 1 to December 31, 2018.
MAIN OUTCOMES AND MEASURES: All-cause mortality, HF hospitalization, top cost decile, and home days loss greater than 25% were modeled using logistic regression, least absolute shrinkage and selection operator (LASSO) regression, classification and regression trees, random forests, and gradient-boosted modeling (GBM). All models were trained using data from network 1 and tested in network 2. After selecting the most efficient modeling approach based on discrimination, Brier score, and calibration, area under precision-recall curves (AUPRCs) and net benefit estimates from decision curves were calculated to focus on the differences when using claims-only vs claims + EMR predictors.
RESULTS: A total of 9502 patients with HF with a mean (SD) age of 78 (8) years were included: 6113 from network 1 (training set) and 3389 from network 2 (testing set). Gradient-boosted modeling consistently provided the highest discrimination, lowest Brier scores, and good calibration across all 4 outcomes; however, logistic regression had generally similar performance (C statistics for logistic regression based on claims-only predictors: mortality, 0.724; 95% CI, 0.705-0.744; HF hospitalization, 0.707; 95% CI, 0.676-0.737; high cost, 0.734; 95% CI, 0.703-0.764; and home days loss, 0.781; 95% CI, 0.764-0.798; C statistics for GBM: mortality, 0.727; 95% CI, 0.708-0.747; HF hospitalization, 0.745; 95% CI, 0.718-0.772; high cost, 0.733; 95% CI, 0.703-0.763; and home days loss, 0.790; 95% CI, 0.773-0.807). Higher AUPRCs were obtained for claims + EMR vs claims-only GBMs predicting mortality (0.484 vs 0.423), HF hospitalization (0.413 vs 0.403), and home days loss (0.575 vs 0.521) but not cost (0.249 vs 0.252). The net benefit for claims + EMR vs claims-only GBMs was higher at various threshold probabilities for the mortality and home days loss outcomes but similar for the other 2 outcomes.
CONCLUSIONS AND RELEVANCE: Machine learning methods offered only limited improvement over traditional logistic regression in predicting key HF outcomes. Adding EMR-derived predictors to claims-based models appeared to improve prediction for some, but not all, outcomes.
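The abstract compares models by discrimination (C statistic), Brier score, and net benefit from decision curves. As a rough illustration of what those three quantities measure, the following is a minimal sketch in plain Python. It is not the authors' code: the function names and the toy outcome and risk vectors are hypothetical and are not data from the study.

```python
# Illustrative sketch only; function names and toy data are assumptions,
# not taken from the study or its analysis code.

def c_statistic(y_true, y_prob):
    """Share of (event, non-event) pairs in which the event received the
    higher predicted risk; tied predictions count as half a pair."""
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for e in events:
        for n in non_events:
            if e > n:
                concordant += 1.0
            elif e == n:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted risk and observed outcome
    (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def net_benefit(y_true, y_prob, threshold):
    """Decision-curve net benefit at one threshold probability:
    TP/n - (FP/n) * threshold / (1 - threshold)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    flagged = [(y, p) for y, p in zip(y_true, y_prob) if p >= threshold]
    true_pos = sum(1 for y, _ in flagged if y == 1)
    false_pos = sum(1 for y, _ in flagged if y == 0)
    return true_pos / n - (false_pos / n) * (threshold / (1.0 - threshold))


# Hypothetical outcomes (1 = event) and predicted risks from two made-up models.
toy_outcomes = [1, 0, 1, 0, 0, 1, 0, 0]
toy_risk_claims_only = [0.8, 0.3, 0.4, 0.5, 0.1, 0.7, 0.2, 0.6]
toy_risk_claims_emr = [0.9, 0.2, 0.6, 0.4, 0.1, 0.8, 0.2, 0.5]

for label, risks in [("claims-only", toy_risk_claims_only),
                     ("claims + EMR", toy_risk_claims_emr)]:
    print(label,
          "C statistic:", round(c_statistic(toy_outcomes, risks), 3),
          "Brier:", round(brier_score(toy_outcomes, risks), 3),
          "net benefit at 0.5:", round(net_benefit(toy_outcomes, risks, 0.5), 3))
```

In a setup like the one described, these functions would be applied to predictions made on the held-out network after fitting on the other network; a decision curve is obtained by evaluating net_benefit over a range of thresholds.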
format Online
Article
Text
id pubmed-6991258
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher American Medical Association
record_format MEDLINE/PubMed
spelling pubmed-6991258 2020-02-11 JAMA Netw Open Original Investigation American Medical Association 2020-01-10 /pmc/articles/PMC6991258/ /pubmed/31922560 http://dx.doi.org/10.1001/jamanetworkopen.2019.18962 Text en Copyright 2020 Desai RJ et al. JAMA Network Open. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article distributed under the terms of the CC-BY-NC-ND License.
title Comparison of Machine Learning Methods With Traditional Models for Use of Administrative Claims With Electronic Medical Records to Predict Heart Failure Outcomes
topic Original Investigation