Development of a Multilevel Model to Identify Patients at Risk for Delay in Starting Cancer Treatment

Bibliographic Details
Main Authors: Frosch, Zachary A. K., Hasler, Jill, Handorf, Elizabeth, DuBois, Tesla, Bleicher, Richard J., Edelman, Martin J., Geynisman, Daniel M., Hall, Michael J., Fang, Carolyn Y., Lynch, Shannon M.
Format: Online Article Text
Language: English
Published: American Medical Association 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10425824/
https://www.ncbi.nlm.nih.gov/pubmed/37578796
http://dx.doi.org/10.1001/jamanetworkopen.2023.28712
_version_ 1785089924256497664
author Frosch, Zachary A. K.
Hasler, Jill
Handorf, Elizabeth
DuBois, Tesla
Bleicher, Richard J.
Edelman, Martin J.
Geynisman, Daniel M.
Hall, Michael J.
Fang, Carolyn Y.
Lynch, Shannon M.
author_sort Frosch, Zachary A. K.
collection PubMed
description IMPORTANCE: Delays in starting cancer treatment disproportionately affect vulnerable populations and can influence patients’ experience and outcomes. Machine learning algorithms incorporating electronic health record (EHR) data and neighborhood-level social determinants of health (SDOH) measures may identify at-risk patients.
OBJECTIVE: To develop and validate a machine learning model for estimating the probability of a treatment delay using multilevel data sources.
DESIGN, SETTING, AND PARTICIPANTS: This cohort study evaluated 4 different machine learning approaches for estimating the likelihood of a treatment delay greater than 60 days (group least absolute shrinkage and selection operator [LASSO], bayesian additive regression tree, gradient boosting, and random forest). Criteria for selecting between approaches were discrimination, calibration, and interpretability/simplicity. The multilevel data set included clinical, demographic, and neighborhood-level census data derived from the EHR, cancer registry, and American Community Survey. Patients with invasive breast, lung, colorectal, bladder, or kidney cancer diagnosed from 2013 to 2019 and treated at a comprehensive cancer center were included. Data analysis was performed from January 2022 to June 2023.
EXPOSURES: Variables included demographics, cancer characteristics, comorbidities, laboratory values, imaging orders, and neighborhood variables.
MAIN OUTCOMES AND MEASURES: The outcome estimated by machine learning models was likelihood of a delay greater than 60 days between cancer diagnosis and treatment initiation. The primary metric used to evaluate model performance was area under the receiver operating characteristic curve (AUC-ROC).
RESULTS: A total of 6409 patients were included (mean [SD] age, 62.8 [12.5] years; 4321 [67.4%] female; 2576 [40.2%] with breast cancer, 1738 [27.1%] with lung cancer, and 1059 [16.5%] with kidney cancer). A total of 1621 (25.3%) experienced a delay greater than 60 days. The selected group LASSO model had an AUC-ROC of 0.713 (95% CI, 0.679-0.745). Lower likelihood of delay was seen with diagnosis at the treating institution; first malignant neoplasm; Asian or Pacific Islander or White race; private insurance; and lacking comorbidities. Greater likelihood of delay was seen at the extremes of neighborhood deprivation. Model performance (AUC-ROC) was lower in Black patients, patients with race and ethnicity other than non-Hispanic White, and those living in the most disadvantaged neighborhoods. Though the model selected neighborhood SDOH variables as contributing variables, performance was similar when fit with and without these variables.
CONCLUSIONS AND RELEVANCE: In this cohort study, a machine learning model incorporating EHR and SDOH data was able to estimate the likelihood of delays in starting cancer therapy. Future work should focus on additional ways to incorporate SDOH data to improve model performance, particularly in vulnerable populations.
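The abstract reports discrimination as an AUC-ROC with a 95% CI but does not say how the interval was obtained. As a rough illustration only, the sketch below computes AUC-ROC from its pairwise definition and attaches a percentile bootstrap interval. The bootstrap choice, the function names (`auc_roc`, `bootstrap_ci`, `make_synthetic`), and the synthetic data are all assumptions for illustration; none of it comes from the study or reproduces its results.

```python
import random


def auc_roc(labels, scores):
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=200, alpha=0.05, seed=0):
    """Percentile bootstrap interval for AUC-ROC (an assumed method)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacked one class; draw again
        stats.append(auc_roc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


def make_synthetic(n=300, seed=1):
    """Hypothetical data: about 25% 'delayed' cases with shifted risk scores."""
    rng = random.Random(seed)
    labels, scores = [], []
    for _ in range(n):
        y = 1 if rng.random() < 0.25 else 0
        labels.append(y)
        scores.append(rng.gauss(0.8 if y else 0.0, 1.0))
    return labels, scores


if __name__ == "__main__":
    y, s = make_synthetic()
    low, high = bootstrap_ci(y, s)
    print(f"AUC-ROC {auc_roc(y, s):.3f} (95% CI, {low:.3f}-{high:.3f})")
```

Running the same two functions on a subset of rows (for example, one racial or neighborhood-deprivation group) would give a subgroup AUC-ROC of the kind the abstract compares, again only as a sketch.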
format Online
Article
Text
id pubmed-10425824
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Medical Association
record_format MEDLINE/PubMed
spelling pubmed-10425824 2023-08-16 Development of a Multilevel Model to Identify Patients at Risk for Delay in Starting Cancer Treatment Frosch, Zachary A. K. Hasler, Jill Handorf, Elizabeth DuBois, Tesla Bleicher, Richard J. Edelman, Martin J. Geynisman, Daniel M. Hall, Michael J. Fang, Carolyn Y. Lynch, Shannon M. JAMA Netw Open Original Investigation American Medical Association 2023-08-14 /pmc/articles/PMC10425824/ /pubmed/37578796 http://dx.doi.org/10.1001/jamanetworkopen.2023.28712 Text en Copyright 2023 Frosch ZAK et al. JAMA Network Open. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the CC-BY License.
title Development of a Multilevel Model to Identify Patients at Risk for Delay in Starting Cancer Treatment
topic Original Investigation
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10425824/
https://www.ncbi.nlm.nih.gov/pubmed/37578796
http://dx.doi.org/10.1001/jamanetworkopen.2023.28712