
Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data

Air pollution is a serious challenge to humankind as it poses many health threats. It can be measured using the air quality index (AQI). Air pollution is the result of contamination of both outdoor and indoor environments. The AQI is being monitored by various institutions globally. The measured air...


Bibliographic Details
Main Authors: Saminathan, S., Malathy, C.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10289837/
https://www.ncbi.nlm.nih.gov/pubmed/37360751
http://dx.doi.org/10.3389/fdata.2023.1175259
_version_ 1785062364816605184
author Saminathan, S.
Malathy, C.
author_facet Saminathan, S.
Malathy, C.
author_sort Saminathan, S.
collection PubMed
description Air pollution is a serious challenge to humankind as it poses many health threats. It can be measured using the air quality index (AQI). Air pollution is the result of contamination of both outdoor and indoor environments. The AQI is being monitored by various institutions globally. The measured air quality data are kept mostly for public use. Using the previously calculated AQI values, the future values of AQI can be predicted, or the class/category value of the numeric value can be obtained. This forecast can be performed with more accuracy using supervised machine learning methods. In this study, multiple machine-learning approaches were used to classify PM2.5 values. The values for the pollutant PM2.5 were classified into different groups using machine learning algorithms such as logistic regression, support vector machines, random forest, extreme gradient boosting, and their grid search equivalents, along with the deep learning method multilayer perceptron. After performing multiclass classification using these algorithms, the parameters accuracy and per-class accuracy were used to compare the methods. As the dataset used was imbalanced, a SMOTE-based approach for balancing the dataset was used. Compared to all other classifiers that use the original dataset, the accuracy of the random forest multiclass classifier with SMOTE-based dataset balancing was found to provide better accuracy.
format Online
Article
Text
id pubmed-10289837
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-10289837 2023-06-24 Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data Saminathan, S. Malathy, C. Front Big Data Big Data Air pollution is a serious challenge to humankind as it poses many health threats. It can be measured using the air quality index (AQI). Air pollution is the result of contamination of both outdoor and indoor environments. The AQI is being monitored by various institutions globally. The measured air quality data are kept mostly for public use. Using the previously calculated AQI values, the future values of AQI can be predicted, or the class/category value of the numeric value can be obtained. This forecast can be performed with more accuracy using supervised machine learning methods. In this study, multiple machine-learning approaches were used to classify PM2.5 values. The values for the pollutant PM2.5 were classified into different groups using machine learning algorithms such as logistic regression, support vector machines, random forest, extreme gradient boosting, and their grid search equivalents, along with the deep learning method multilayer perceptron. After performing multiclass classification using these algorithms, the parameters accuracy and per-class accuracy were used to compare the methods. As the dataset used was imbalanced, a SMOTE-based approach for balancing the dataset was used. Compared to all other classifiers that use the original dataset, the accuracy of the random forest multiclass classifier with SMOTE-based dataset balancing was found to provide better accuracy. Frontiers Media S.A. 2023-06-09 /pmc/articles/PMC10289837/ /pubmed/37360751 http://dx.doi.org/10.3389/fdata.2023.1175259 Text en Copyright © 2023 Saminathan and Malathy. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY).
The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Big Data
Saminathan, S.
Malathy, C.
Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data
title Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data
title_full Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data
title_fullStr Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data
title_full_unstemmed Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data
title_short Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data
title_sort ensemble-based classification approach for pm2.5 concentration forecasting using meteorological data
topic Big Data
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10289837/
https://www.ncbi.nlm.nih.gov/pubmed/37360751
http://dx.doi.org/10.3389/fdata.2023.1175259