
An interpretable machine learning model of cross-sectional U.S. county-level obesity prevalence using explainable artificial intelligence

BACKGROUND: There is considerable geographic heterogeneity in obesity prevalence across counties in the United States. Machine learning algorithms accurately predict geographic variation in obesity prevalence, but the models are often uninterpretable and viewed as black boxes. OBJECTIVE: The goal of...

Bibliographic Details
Main Author: Allen, Ben
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10553328/
https://www.ncbi.nlm.nih.gov/pubmed/37796874
http://dx.doi.org/10.1371/journal.pone.0292341
author Allen, Ben
collection PubMed
description BACKGROUND: There is considerable geographic heterogeneity in obesity prevalence across counties in the United States. Machine learning algorithms accurately predict geographic variation in obesity prevalence, but the models are often uninterpretable and viewed as black boxes. OBJECTIVE: The goal of this study is to extract knowledge from machine learning models for county-level variation in obesity prevalence. METHODS: This study applies explainable artificial intelligence methods to machine learning models of cross-sectional obesity prevalence data collected from 3,142 counties in the United States. County-level features were drawn from 7 broad categories: health outcomes, health behaviors, clinical care, social and economic factors, physical environment, demographics, and severe housing conditions. Explainable methods applied to random forest prediction models include feature importance, accumulated local effects, global surrogate decision tree, and local interpretable model-agnostic explanations. RESULTS: The machine learning models explained 79% of the variance in obesity prevalence, with physical inactivity, diabetes, and smoking prevalence being the most important factors in predicting obesity prevalence. CONCLUSIONS: Interpretable machine learning models of health behaviors and outcomes provide substantial insight into obesity prevalence variation across counties in the United States.
format Online
Article
Text
id pubmed-10553328
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-10553328 2023-10-06 An interpretable machine learning model of cross-sectional U.S. county-level obesity prevalence using explainable artificial intelligence Allen, Ben PLoS One Research Article
Public Library of Science 2023-10-05 /pmc/articles/PMC10553328/ /pubmed/37796874 http://dx.doi.org/10.1371/journal.pone.0292341 Text en © 2023 Ben Allen https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title An interpretable machine learning model of cross-sectional U.S. county-level obesity prevalence using explainable artificial intelligence
topic Research Article
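
The abstract reports two quantities that are easy to illustrate: the share of variance explained (79%) and a ranking of features by importance. The sketch below shows one way such numbers can be computed. It is not the study's code. The record does not say which importance measure was used, so permutation importance is assumed here, a hand-written linear function (`toy_model`) stands in for the random forest, and the data are synthetic. All names and coefficients are hypothetical.

```python
import random


def r_squared(y_true, y_pred):
    """Share of variance in y_true explained by y_pred: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def permutation_importance(predict, X, y, n_repeats=20, seed=0):
    """Mean drop in R^2 when a single feature column is shuffled across rows."""
    rng = random.Random(seed)
    baseline = r_squared(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = r_squared(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic stand-in data (hypothetical, NOT the study's county data).
# Columns: physical inactivity (%), diabetes (%), and an unrelated feature.
data_rng = random.Random(42)
X = [
    [data_rng.uniform(15, 40), data_rng.uniform(5, 20), data_rng.uniform(0, 1)]
    for _ in range(200)
]
y = [0.6 * row[0] + 0.9 * row[1] + data_rng.gauss(0, 1.0) for row in X]


def toy_model(row):
    """Placeholder predictor; the study used random forests instead."""
    return 0.6 * row[0] + 0.9 * row[1]


baseline_r2 = r_squared(y, [toy_model(row) for row in X])
scores = permutation_importance(toy_model, X, y)
for name, score in zip(["inactivity", "diabetes", "unrelated"], scores):
    print(f"{name}: {score:.3f}")
```

Because `toy_model` ignores the third column, shuffling it leaves the predictions unchanged and its importance is exactly zero, while the two columns the model uses show a positive drop in R². The same `permutation_importance` function accepts any `predict` callable, which is what makes the measure model-agnostic.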