
Application of Bayesian Additive Regression Trees for Estimating Daily Concentrations of PM2.5 Components

Bayesian additive regression tree (BART) is a recent statistical method that combines ensemble learning and nonparametric regression. BART is constructed under a probabilistic framework that also allows for model-based prediction uncertainty quantification. We evaluated the application of BART in pr...

Bibliographic Details
Main authors: Zhang, Tianyu; Geng, Guannan; Liu, Yang; Chang, Howard H.
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8315111/
https://www.ncbi.nlm.nih.gov/pubmed/34322279
http://dx.doi.org/10.3390/atmos11111233
author Zhang, Tianyu
Geng, Guannan
Liu, Yang
Chang, Howard H.
collection PubMed
description Bayesian additive regression tree (BART) is a recent statistical method that combines ensemble learning and nonparametric regression. BART is constructed under a probabilistic framework that also allows for model-based prediction uncertainty quantification. We evaluated the application of BART in predicting daily concentrations of four fine particulate matter (PM2.5) components (elemental carbon, organic carbon, nitrate, and sulfate) in California during the period 2005 to 2014. We demonstrate in this paper how BART can be tuned to optimize prediction performance and how to evaluate variable importance. Our BART models included, as predictors, a large suite of land-use variables, meteorological conditions, satellite-derived aerosol optical depth parameters, and simulations from a chemical transport model. In cross-validation experiments, BART demonstrated good out-of-sample prediction performance at monitoring locations (R² from 0.62 to 0.73). More importantly, prediction intervals associated with concentration estimates from BART showed good coverage probability at locations with and without monitoring data. In our case study, major PM2.5 components could be estimated with good accuracy, especially when collocated PM2.5 total mass observations were available. In conclusion, BART is an attractive approach for modeling ambient air pollution levels, especially for its ability to provide uncertainty in estimates that may be useful for subsequent health impact and health effect analyses.
format Online
Article
Text
id pubmed-8315111
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-8315111 2021-07-27 Atmosphere (Basel) Article 2020-11-16 2020-11 /pmc/articles/PMC8315111/ /pubmed/34322279 http://dx.doi.org/10.3390/atmos11111233 Text en https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Application of Bayesian Additive Regression Trees for Estimating Daily Concentrations of PM2.5 Components
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8315111/
https://www.ncbi.nlm.nih.gov/pubmed/34322279
http://dx.doi.org/10.3390/atmos11111233
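
The description in this record reports two cross-validation summaries: out-of-sample R² and the coverage probability of prediction intervals. As a rough illustration of how such summaries are commonly computed from posterior predictive draws, here is a minimal sketch. It is not the authors' code. The function names, the synthetic data, and the 95% interval level are all assumptions made for the example, and the draws are simulated rather than produced by a BART model.

```python
import numpy as np


def out_of_sample_r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def interval_coverage(y_true, draws, level=0.95):
    """Fraction of held-out observations that fall inside the central
    `level` interval of the predictive draws.

    `draws` has shape (n_draws, n_obs): one column of draws per observation.
    """
    y_true = np.asarray(y_true, dtype=float)
    alpha = 1.0 - level
    lower = np.quantile(draws, alpha / 2, axis=0)
    upper = np.quantile(draws, 1 - alpha / 2, axis=0)
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


# Synthetic stand-in for held-out data (assumption: Gaussian noise, known sd).
rng = np.random.default_rng(0)
n_obs, n_draws, sd = 500, 2000, 0.5
pred_mean = rng.normal(size=n_obs)                    # hypothetical predictive means
y_held_out = pred_mean + rng.normal(scale=sd, size=n_obs)
pred_draws = pred_mean + rng.normal(scale=sd, size=(n_draws, n_obs))

r2 = out_of_sample_r2(y_held_out, pred_draws.mean(axis=0))
coverage = interval_coverage(y_held_out, pred_draws, level=0.95)
print(f"R^2 = {r2:.2f}, 95% interval coverage = {coverage:.2f}")
```

With well-calibrated draws, the empirical coverage should sit close to the nominal level. Coverage well below it suggests intervals that are too narrow.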