Cargando…
Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables
Construction and demolition waste (DW) generation information has been recognized as a tool for providing useful information for waste management. Recently, numerous researchers have actively utilized artificial intelligence technology to establish accurate waste generation information. This study i...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
MDPI
2021
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8392226/ https://www.ncbi.nlm.nih.gov/pubmed/34444277 http://dx.doi.org/10.3390/ijerph18168530 |
_version_ | 1783743451625422848 |
---|---|
author | Cha, Gi-Wook Moon, Hyeun-Jun Kim, Young-Chan |
author_facet | Cha, Gi-Wook Moon, Hyeun-Jun Kim, Young-Chan |
author_sort | Cha, Gi-Wook |
collection | PubMed |
description | Construction and demolition waste (DW) generation information has been recognized as a tool for providing useful information for waste management. Recently, numerous researchers have actively utilized artificial intelligence technology to establish accurate waste generation information. This study investigated the development of machine learning predictive models that can achieve predictive performance on small datasets composed of categorical variables. To this end, the random forest (RF) and gradient boosting machine (GBM) algorithms were adopted. To develop the models, 690 building datasets were established using data preprocessing and standardization. Hyperparameter tuning was performed to develop the RF and GBM models. The model performances were evaluated using the leave-one-out cross-validation technique. The study demonstrated that, for small datasets comprising mainly categorical variables, the bagging technique (RF) predictions were more stable and accurate than those of the boosting technique (GBM). However, GBM models demonstrated excellent predictive performance in some DW predictive models. Furthermore, the RF and GBM predictive models demonstrated significantly differing performance across different types of DW. Certain RF and GBM models demonstrated relatively low predictive performance. However, the remaining predictive models all demonstrated excellent predictive performance at R(2) values > 0.6, and R values > 0.8. Such differences are mainly because of the characteristics of features applied to model development; we expect the application of additional features to improve the performance of the predictive models. The 11 DW predictive models developed in this study will be useful for establishing detailed DW management strategies. |
format | Online Article Text |
id | pubmed-8392226 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-83922262021-08-28 Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables Cha, Gi-Wook Moon, Hyeun-Jun Kim, Young-Chan Int J Environ Res Public Health Article Construction and demolition waste (DW) generation information has been recognized as a tool for providing useful information for waste management. Recently, numerous researchers have actively utilized artificial intelligence technology to establish accurate waste generation information. This study investigated the development of machine learning predictive models that can achieve predictive performance on small datasets composed of categorical variables. To this end, the random forest (RF) and gradient boosting machine (GBM) algorithms were adopted. To develop the models, 690 building datasets were established using data preprocessing and standardization. Hyperparameter tuning was performed to develop the RF and GBM models. The model performances were evaluated using the leave-one-out cross-validation technique. The study demonstrated that, for small datasets comprising mainly categorical variables, the bagging technique (RF) predictions were more stable and accurate than those of the boosting technique (GBM). However, GBM models demonstrated excellent predictive performance in some DW predictive models. Furthermore, the RF and GBM predictive models demonstrated significantly differing performance across different types of DW. Certain RF and GBM models demonstrated relatively low predictive performance. However, the remaining predictive models all demonstrated excellent predictive performance at R(2) values > 0.6, and R values > 0.8. Such differences are mainly because of the characteristics of features applied to model development; we expect the application of additional features to improve the performance of the predictive models. The 11 DW predictive models developed in this study will be useful for establishing detailed DW management strategies. MDPI 2021-08-12 /pmc/articles/PMC8392226/ /pubmed/34444277 http://dx.doi.org/10.3390/ijerph18168530 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Cha, Gi-Wook Moon, Hyeun-Jun Kim, Young-Chan Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables |
title | Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables |
title_full | Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables |
title_fullStr | Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables |
title_full_unstemmed | Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables |
title_short | Comparison of Random Forest and Gradient Boosting Machine Models for Predicting Demolition Waste Based on Small Datasets and Categorical Variables |
title_sort | comparison of random forest and gradient boosting machine models for predicting demolition waste based on small datasets and categorical variables |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8392226/ https://www.ncbi.nlm.nih.gov/pubmed/34444277 http://dx.doi.org/10.3390/ijerph18168530 |
work_keys_str_mv | AT chagiwook comparisonofrandomforestandgradientboostingmachinemodelsforpredictingdemolitionwastebasedonsmalldatasetsandcategoricalvariables AT moonhyeunjun comparisonofrandomforestandgradientboostingmachinemodelsforpredictingdemolitionwastebasedonsmalldatasetsandcategoricalvariables AT kimyoungchan comparisonofrandomforestandgradientboostingmachinemodelsforpredictingdemolitionwastebasedonsmalldatasetsandcategoricalvariables |