Cargando…

Tree‐Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level

BACKGROUND: Stroke is a major cardiovascular disease that causes significant health and economic burden in the United States. Neighborhood community‐based interventions have been shown to be both effective and cost‐effective in preventing cardiovascular disease. There is a dearth of robust studies i...

Descripción completa

Detalles Bibliográficos
Autores principales: Hu, Liangyuan, Liu, Bian, Ji, Jiayi, Li, Yan
Formato: Online Artículo Texto
Lenguaje:English
Publicado: John Wiley and Sons Inc. 2020
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7763737/
https://www.ncbi.nlm.nih.gov/pubmed/33140687
http://dx.doi.org/10.1161/JAHA.120.016745
Descripción
Sumario:BACKGROUND: Stroke is a major cardiovascular disease that causes significant health and economic burden in the United States. Neighborhood community‐based interventions have been shown to be both effective and cost‐effective in preventing cardiovascular disease. There is a dearth of robust studies identifying the key determinants of cardiovascular disease and the underlying effect mechanisms at the neighborhood level. We aim to contribute to the evidence base for neighborhood cardiovascular health research. METHODS AND RESULTS: We created a new neighborhood health data set at the census tract level by integrating 4 types of potential predictors, including unhealthy behaviors, prevention measures, sociodemographic factors, and environmental measures from multiple data sources. We used 4 tree‐based machine learning techniques to identify the most critical neighborhood‐level factors in predicting the neighborhood‐level prevalence of stroke, and compared their predictive performance for variable selection. We further quantified the effects of the identified determinants on stroke prevalence using a Bayesian linear regression model. Of the 5 most important predictors identified by our method, higher prevalence of low physical activity, larger share of older adults, higher percentage of non‐Hispanic Black people, and higher ozone levels were associated with higher prevalence of stroke at the neighborhood level. Higher median household income was linked to lower prevalence. The most important interaction term showed an exacerbated adverse effect of aging and low physical activity on the neighborhood‐level prevalence of stroke. CONCLUSIONS: Tree‐based machine learning provides insights into underlying drivers of neighborhood cardiovascular health by discovering the most important determinants from a wide range of factors in an agnostic, data‐driven, and reproducible way. The identified major determinants and the interactive mechanism can be used to prioritize and allocate resources to optimize community‐level interventions for stroke prevention.