Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data
Air pollution is a serious challenge to humankind as it poses many health threats. It can be measured using the air quality index (AQI). Air pollution is the result of contamination of both outdoor and indoor environments. The AQI is being monitored by various institutions globally. The measured air...
Main authors: | Saminathan, S., Malathy, C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Big Data |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10289837/ https://www.ncbi.nlm.nih.gov/pubmed/37360751 http://dx.doi.org/10.3389/fdata.2023.1175259 |
_version_ | 1785062364816605184 |
---|---|
author | Saminathan, S. Malathy, C. |
author_facet | Saminathan, S. Malathy, C. |
author_sort | Saminathan, S. |
collection | PubMed |
description | Air pollution is a serious challenge to humankind as it poses many health threats. It can be measured using the air quality index (AQI). Air pollution is the result of contamination of both outdoor and indoor environments. The AQI is monitored by various institutions globally. The measured air quality data are kept mostly for public use. Using the previously calculated AQI values, the future values of AQI can be predicted, or the class/category value of the numeric value can be obtained. This forecast can be performed with more accuracy using supervised machine learning methods. In this study, multiple machine-learning approaches were used to classify PM2.5 values. The values for the pollutant PM2.5 were classified into different groups using machine learning algorithms such as logistic regression, support vector machines, random forest, extreme gradient boosting, and their grid search equivalents, along with the deep learning method multilayer perceptron. After performing multiclass classification using these algorithms, accuracy and per-class accuracy were used as the metrics to compare the methods. Because the dataset was imbalanced, a SMOTE-based approach was used to balance it. Compared with all other classifiers trained on the original dataset, the random forest multiclass classifier with SMOTE-based dataset balancing was found to provide better accuracy. |
format | Online Article Text |
id | pubmed-10289837 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-102898372023-06-24 Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data Saminathan, S. Malathy, C. Front Big Data Big Data Air pollution is a serious challenge to humankind as it poses many health threats. It can be measured using the air quality index (AQI). Air pollution is the result of contamination of both outdoor and indoor environments. The AQI is being monitored by various institutions globally. The measured air quality data are kept mostly for public use. Using the previously calculated AQI values, the future values of AQI can be predicted, or the class/category value of the numeric value can be obtained. This forecast can be performed with more accuracy using supervised machine learning methods. In this study, multiple machine-learning approaches were used to classify PM2.5 values. The values for the pollutant PM2.5 were classified into different groups using machine learning algorithms such as logistic regression, support vector machines, random forest, extreme gradient boosting, and their grid search equivalents, along with the deep learning method multilayer perceptron. After performing multiclass classification using these algorithms, the parameters accuracy and per-class accuracy were used to compare the methods. As the dataset used was imbalanced, a SMOTE-based approach for balancing the dataset was used. Compared to all other classifiers that use the original dataset, the accuracy of the random forest multiclass classifier with SMOTE-based dataset balancing was found to provide better accuracy. Frontiers Media S.A. 2023-06-09 /pmc/articles/PMC10289837/ /pubmed/37360751 http://dx.doi.org/10.3389/fdata.2023.1175259 Text en Copyright © 2023 Saminathan and Malathy. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Big Data Saminathan, S. Malathy, C. Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data |
title | Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data |
title_full | Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data |
title_fullStr | Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data |
title_full_unstemmed | Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data |
title_short | Ensemble-based classification approach for PM2.5 concentration forecasting using meteorological data |
title_sort | ensemble-based classification approach for pm2.5 concentration forecasting using meteorological data |
topic | Big Data |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10289837/ https://www.ncbi.nlm.nih.gov/pubmed/37360751 http://dx.doi.org/10.3389/fdata.2023.1175259 |
work_keys_str_mv | AT saminathans ensemblebasedclassificationapproachforpm25concentrationforecastingusingmeteorologicaldata AT malathyc ensemblebasedclassificationapproachforpm25concentrationforecastingusingmeteorologicaldata |
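
The description field in this record mentions two steps that are easy to illustrate: SMOTE-based balancing of an imbalanced dataset, and comparison of classifiers by per-class accuracy. The record does not say which implementation or settings the authors used, so what follows is only a minimal sketch under stated assumptions: a simplified SMOTE-style oversampler written from scratch with the standard library, not a library SMOTE and not the authors' code. The function names `smote_like_oversample` and `per_class_accuracy`, the parameter `k`, the toy feature values, and the labels "low" and "high" are all hypothetical. The classifier itself (random forest in the study) is left out; any classifier could be trained on the balanced output.

```python
import math
import random
from collections import Counter, defaultdict


def smote_like_oversample(X, y, k=3, seed=0):
    """Balance classes by interpolating between same-class neighbours.

    Simplified SMOTE-style sketch (hypothetical helper, not the authors' code).
    X is a list of numeric feature lists, y the matching class labels.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for features, label in zip(X, y):
        by_class[label].append(features)
    # Every class is grown to the size of the largest class.
    target = max(len(rows) for rows in by_class.values())

    X_out, y_out = list(X), list(y)
    for label, rows in by_class.items():
        if len(rows) < 2:
            continue  # interpolation needs at least two samples of the class
        for _ in range(target - len(rows)):
            base = rng.choice(rows)
            # k nearest same-class neighbours of the base point (excluding itself)
            others = [r for r in rows if r is not base]
            neighbours = sorted(others, key=lambda r: math.dist(base, r))[:k]
            neighbour = rng.choice(neighbours)
            gap = rng.random()  # position along the segment base -> neighbour
            X_out.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


def per_class_accuracy(y_true, y_pred):
    """Fraction of each true class's samples that were predicted correctly."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: correct[label] / total[label] for label in total}


# Toy data for illustration only: four "low" samples and two "high" samples.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
y = ["low", "low", "low", "low", "high", "high"]
X_bal, y_bal = smote_like_oversample(X, y, k=1)
print(Counter(y_bal))  # both classes now have four samples

# Per-class accuracy on hypothetical predictions.
print(per_class_accuracy(["a", "a", "b", "b", "b"], ["a", "b", "b", "b", "a"]))
```

Per-class accuracy is reported alongside overall accuracy because, on an imbalanced dataset, a classifier can score well overall while rarely predicting the minority classes correctly. In practice the oversampling would normally be applied to the training split only, so that synthetic points do not leak into the evaluation data.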