Tree‐Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level
BACKGROUND: Stroke is a major cardiovascular disease that causes significant health and economic burden in the United States. Neighborhood community‐based interventions have been shown to be both effective and cost‐effective in preventing cardiovascular disease. There is a dearth of robust studies i...
Main authors: | Hu, Liangyuan; Liu, Bian; Ji, Jiayi; Li, Yan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | John Wiley and Sons Inc. 2020 |
Subjects: | Original Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7763737/ https://www.ncbi.nlm.nih.gov/pubmed/33140687 http://dx.doi.org/10.1161/JAHA.120.016745 |
_version_ | 1783628089357500416 |
---|---|
author | Hu, Liangyuan Liu, Bian Ji, Jiayi Li, Yan |
author_sort | Hu, Liangyuan |
collection | PubMed |
description | BACKGROUND: Stroke is a major cardiovascular disease that causes significant health and economic burden in the United States. Neighborhood community‐based interventions have been shown to be both effective and cost‐effective in preventing cardiovascular disease. There is a dearth of robust studies identifying the key determinants of cardiovascular disease and the underlying effect mechanisms at the neighborhood level. We aim to contribute to the evidence base for neighborhood cardiovascular health research. METHODS AND RESULTS: We created a new neighborhood health data set at the census tract level by integrating 4 types of potential predictors, including unhealthy behaviors, prevention measures, sociodemographic factors, and environmental measures from multiple data sources. We used 4 tree‐based machine learning techniques to identify the most critical neighborhood‐level factors in predicting the neighborhood‐level prevalence of stroke, and compared their predictive performance for variable selection. We further quantified the effects of the identified determinants on stroke prevalence using a Bayesian linear regression model. Of the 5 most important predictors identified by our method, higher prevalence of low physical activity, larger share of older adults, higher percentage of non‐Hispanic Black people, and higher ozone levels were associated with higher prevalence of stroke at the neighborhood level. Higher median household income was linked to lower prevalence. The most important interaction term showed an exacerbated adverse effect of aging and low physical activity on the neighborhood‐level prevalence of stroke. CONCLUSIONS: Tree‐based machine learning provides insights into underlying drivers of neighborhood cardiovascular health by discovering the most important determinants from a wide range of factors in an agnostic, data‐driven, and reproducible way. The identified major determinants and the interactive mechanism can be used to prioritize and allocate resources to optimize community‐level interventions for stroke prevention. |
format | Online Article Text |
id | pubmed-7763737 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | John Wiley and Sons Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7763737 2020-12-28 Tree‐Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level Hu, Liangyuan Liu, Bian Ji, Jiayi Li, Yan J Am Heart Assoc Original Research John Wiley and Sons Inc. 2020-11-03 /pmc/articles/PMC7763737/ /pubmed/33140687 http://dx.doi.org/10.1161/JAHA.120.016745 Text en © 2020 The Authors. Published on behalf of the American Heart Association, Inc., by Wiley. This is an open access article under the terms of the http://creativecommons.org/licenses/by-nc-nd/4.0/ License, which permits use and distribution in any medium, provided the original work is properly cited, the use is non‐commercial and no modifications or adaptations are made. |
spellingShingle | Original Research Hu, Liangyuan Liu, Bian Ji, Jiayi Li, Yan Tree‐Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level |
title | Tree‐Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7763737/ https://www.ncbi.nlm.nih.gov/pubmed/33140687 http://dx.doi.org/10.1161/JAHA.120.016745 |