Machine learning driven by environmental covariates to estimate high-resolution PM2.5 in data-poor regions

Bibliographic Details
Main authors: Jin, XiaoYe; Ding, Jianli; Ge, Xiangyu; Liu, Jie; Xie, Boqiang; Zhao, Shuang; Zhao, Qiaozhen
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2022
Subjects: Atmospheric Chemistry
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8976473/
https://www.ncbi.nlm.nih.gov/pubmed/35378927
http://dx.doi.org/10.7717/peerj.13203
author Jin, XiaoYe
Ding, Jianli
Ge, Xiangyu
Liu, Jie
Xie, Boqiang
Zhao, Shuang
Zhao, Qiaozhen
collection PubMed
description PM2.5, which refers to fine particles with an equivalent aerodynamic diameter of less than or equal to 2.5 µm, can not only affect air quality but also endanger public health. Nevertheless, the spatial distribution of PM2.5 is not well understood in data-poor regions where monitoring stations are scarce. Therefore, we constructed a random forest (RF) model and a bagging algorithm model based on ground-monitored PM2.5 data, aerosol optical depth (AOD) and meteorological data, and auxiliary geographical variables to accurately estimate the spatial distribution of PM2.5 concentrations in Xinjiang during 2015–2020 at a resolution of 1 km. Through 10-fold cross-validation (CV), the RF model and bagging algorithm model were verified and compared. The results showed the following: (1) The RF model achieved better model performance and thus can be used to estimate the PM2.5 concentration at a relatively high resolution. (2) The PM2.5 concentrations were high in southern Xinjiang and low in northern Xinjiang. The high values were concentrated mainly in the Tarim Basin, while most areas of northern Xinjiang maintained low PM2.5 levels year-round. (3) The PM2.5 values in Xinjiang showed significant seasonality, with the seasonally averaged concentrations decreasing as follows: winter (71.95 µg m⁻³) > spring (64.76 µg m⁻³) > autumn (46.01 µg m⁻³) > summer (43.40 µg m⁻³). Our model provides a way to monitor air quality in data-scarce places, thereby advancing efforts to achieve sustainable development in the future.
format Online
Article
Text
id pubmed-8976473
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8976473 2022-04-03 PeerJ Atmospheric Chemistry PeerJ Inc. 2022-03-30 /pmc/articles/PMC8976473/ /pubmed/35378927 http://dx.doi.org/10.7717/peerj.13203 Text en ©2022 Jin et al.
https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by-nc/4.0/), which permits using, remixing, and building upon the work non-commercially, as long as it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Machine learning driven by environmental covariates to estimate high-resolution PM2.5 in data-poor regions
topic Atmospheric Chemistry