Tree‐Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level

BACKGROUND: Stroke is a major cardiovascular disease that causes significant health and economic burden in the United States. Neighborhood community‐based interventions have been shown to be both effective and cost‐effective in preventing cardiovascular disease. There is a dearth of robust studies i...

Bibliographic Details
Main authors: Hu, Liangyuan; Liu, Bian; Ji, Jiayi; Li, Yan
Format: Online Article Text
Language: English
Published: John Wiley and Sons Inc. 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7763737/
https://www.ncbi.nlm.nih.gov/pubmed/33140687
http://dx.doi.org/10.1161/JAHA.120.016745
author Hu, Liangyuan
Liu, Bian
Ji, Jiayi
Li, Yan
collection PubMed
description BACKGROUND: Stroke is a major cardiovascular disease that causes significant health and economic burden in the United States. Neighborhood community‐based interventions have been shown to be both effective and cost‐effective in preventing cardiovascular disease. There is a dearth of robust studies identifying the key determinants of cardiovascular disease and the underlying effect mechanisms at the neighborhood level. We aim to contribute to the evidence base for neighborhood cardiovascular health research.
METHODS AND RESULTS: We created a new neighborhood health data set at the census tract level by integrating 4 types of potential predictors, including unhealthy behaviors, prevention measures, sociodemographic factors, and environmental measures from multiple data sources. We used 4 tree‐based machine learning techniques to identify the most critical neighborhood‐level factors in predicting the neighborhood‐level prevalence of stroke, and compared their predictive performance for variable selection. We further quantified the effects of the identified determinants on stroke prevalence using a Bayesian linear regression model. Of the 5 most important predictors identified by our method, higher prevalence of low physical activity, larger share of older adults, higher percentage of non‐Hispanic Black people, and higher ozone levels were associated with higher prevalence of stroke at the neighborhood level. Higher median household income was linked to lower prevalence. The most important interaction term showed an exacerbated adverse effect of aging and low physical activity on the neighborhood‐level prevalence of stroke.
CONCLUSIONS: Tree‐based machine learning provides insights into underlying drivers of neighborhood cardiovascular health by discovering the most important determinants from a wide range of factors in an agnostic, data‐driven, and reproducible way. The identified major determinants and the interactive mechanism can be used to prioritize and allocate resources to optimize community‐level interventions for stroke prevention.
format Online
Article
Text
id pubmed-7763737
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher John Wiley and Sons Inc.
record_format MEDLINE/PubMed
spelling pubmed-7763737 2020-12-28 J Am Heart Assoc Original Research John Wiley and Sons Inc. 2020-11-03 /pmc/articles/PMC7763737/ /pubmed/33140687 http://dx.doi.org/10.1161/JAHA.120.016745 Text en © 2020 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made.
title Tree‐Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7763737/
https://www.ncbi.nlm.nih.gov/pubmed/33140687
http://dx.doi.org/10.1161/JAHA.120.016745