Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables

Construction and demolition waste (DW) generation information has been recognized as a tool for providing useful information for waste management. Recently, numerous researchers have actively utilized artificial intelligence technology to establish accurate waste generation information. This study i...

Bibliographic Details
Main Authors: Cha, Gi-Wook; Moon, Hyeun-Jun; Kim, Young-Chan
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8392226/
https://www.ncbi.nlm.nih.gov/pubmed/34444277
http://dx.doi.org/10.3390/ijerph18168530
_version_ 1783743451625422848
author Cha, Gi-Wook
Moon, Hyeun-Jun
Kim, Young-Chan
author_facet Cha, Gi-Wook
Moon, Hyeun-Jun
Kim, Young-Chan
author_sort Cha, Gi-Wook
collection PubMed
description Construction and demolition waste (DW) generation information has been recognized as a tool for providing useful information for waste management. Recently, numerous researchers have actively utilized artificial intelligence technology to establish accurate waste generation information. This study investigated the development of machine learning predictive models that can achieve predictive performance on small datasets composed of categorical variables. To this end, the random forest (RF) and gradient boosting machine (GBM) algorithms were adopted. To develop the models, 690 building datasets were established using data preprocessing and standardization. Hyperparameter tuning was performed to develop the RF and GBM models. The model performances were evaluated using the leave-one-out cross-validation technique. The study demonstrated that, for small datasets comprising mainly categorical variables, the bagging technique (RF) predictions were more stable and accurate than those of the boosting technique (GBM). However, GBM models demonstrated excellent predictive performance in some DW predictive models. Furthermore, the RF and GBM predictive models demonstrated significantly differing performance across different types of DW. Certain RF and GBM models demonstrated relatively low predictive performance. However, the remaining predictive models all demonstrated excellent predictive performance at R² values > 0.6, and R values > 0.8. Such differences are mainly because of the characteristics of features applied to model development; we expect the application of additional features to improve the performance of the predictive models. The 11 DW predictive models developed in this study will be useful for establishing detailed DW management strategies.
format Online
Article
Text
id pubmed-8392226
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-83922262021-08-28 Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables Cha, Gi-Wook Moon, Hyeun-Jun Kim, Young-Chan Int J Environ Res Public Health Article Construction and demolition waste (DW) generation information has been recognized as a tool for providing useful information for waste management. Recently, numerous researchers have actively utilized artificial intelligence technology to establish accurate waste generation information. This study investigated the development of machine learning predictive models that can achieve predictive performance on small datasets composed of categorical variables. To this end, the random forest (RF) and gradient boosting machine (GBM) algorithms were adopted. To develop the models, 690 building datasets were established using data preprocessing and standardization. Hyperparameter tuning was performed to develop the RF and GBM models. The model performances were evaluated using the leave-one-out cross-validation technique. The study demonstrated that, for small datasets comprising mainly categorical variables, the bagging technique (RF) predictions were more stable and accurate than those of the boosting technique (GBM). However, GBM models demonstrated excellent predictive performance in some DW predictive models. Furthermore, the RF and GBM predictive models demonstrated significantly differing performance across different types of DW. Certain RF and GBM models demonstrated relatively low predictive performance. However, the remaining predictive models all demonstrated excellent predictive performance at R² values > 0.6, and R values > 0.8. Such differences are mainly because of the characteristics of features applied to model development; we expect the application of additional features to improve the performance of the predictive models.
The 11 DW predictive models developed in this study will be useful for establishing detailed DW management strategies. MDPI 2021-08-12 /pmc/articles/PMC8392226/ /pubmed/34444277 http://dx.doi.org/10.3390/ijerph18168530 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Cha, Gi-Wook
Moon, Hyeun-Jun
Kim, Young-Chan
Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables
title Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables
title_full Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables
title_fullStr Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables
title_full_unstemmed Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables
title_short Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables
title_sort comparison of random forest and gradient boosting machine models for predicting demolition waste based on small datasets and categorical variables
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8392226/
https://www.ncbi.nlm.nih.gov/pubmed/34444277
http://dx.doi.org/10.3390/ijerph18168530
work_keys_str_mv AT chagiwook comparisonofrandomforestandgradientboostingmachinemodelsforpredictingdemolitionwastebasedonsmalldatasetsandcategoricalvariables
AT moonhyeunjun comparisonofrandomforestandgradientboostingmachinemodelsforpredictingdemolitionwastebasedonsmalldatasetsandcategoricalvariables
AT kimyoungchan comparisonofrandomforestandgradientboostingmachinemodelsforpredictingdemolitionwastebasedonsmalldatasetsandcategoricalvariables