Prediction of Cavity Length Using an Interpretable Ensemble Learning Approach

The cavity length, which is a vital index in aeration and corrosion reduction engineering, is affected by many factors and is challenging to calculate. In this study, 10-fold cross-validation was performed to select the optimal input configuration. Additionally, the hyperparameters of three ensemble...

Bibliographic Details
Main authors:	Guo, Ganggui; Li, Shanshan; Liu, Yakun; Cao, Ze; Deng, Yangyu
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9819684/ https://www.ncbi.nlm.nih.gov/pubmed/36613022 http://dx.doi.org/10.3390/ijerph20010702

author	Guo, Ganggui; Li, Shanshan; Liu, Yakun; Cao, Ze; Deng, Yangyu
collection	PubMed
description	The cavity length, which is a vital index in aeration and corrosion reduction engineering, is affected by many factors and is challenging to calculate. In this study, 10-fold cross-validation was performed to select the optimal input configuration. Additionally, the hyperparameters of three ensemble learning models—random forest (RF), gradient boosting decision tree (GBDT), and extreme gradient boosting tree (XGBOOST)—were fine-tuned by the Bayesian optimization (BO) algorithm to improve the prediction accuracy and compare the five empirical methods. The XGBOOST method was observed to present the highest prediction accuracy. Further interpretability analysis carried out using the Sobol method demonstrated its ability to reasonably capture the varying relative significance of different input features under different flow conditions. The Sobol sensitivity analysis also observed two patterns of extracting information from the input features in ML models: (1) the main effect of individual features in ensemble learning and (2) the interactive effect between each feature in SVR. From the results, the models obtaining individual information both predict the cavity length more accurately than that using interactive information. Subsequently, the XGBOOST captures more correct information from features, which leads to the varied Sobol index in accordance with outside phenomena; meanwhile, the predicted results fit the experimental points best.
format	Online Article Text
id	pubmed-9819684
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9819684 2023-01-07 Prediction of Cavity Length Using an Interpretable Ensemble Learning Approach Guo, Ganggui; Li, Shanshan; Liu, Yakun; Cao, Ze; Deng, Yangyu Int J Environ Res Public Health Article The cavity length, which is a vital index in aeration and corrosion reduction engineering, is affected by many factors and is challenging to calculate. In this study, 10-fold cross-validation was performed to select the optimal input configuration. Additionally, the hyperparameters of three ensemble learning models—random forest (RF), gradient boosting decision tree (GBDT), and extreme gradient boosting tree (XGBOOST)—were fine-tuned by the Bayesian optimization (BO) algorithm to improve the prediction accuracy and compare the five empirical methods. The XGBOOST method was observed to present the highest prediction accuracy. Further interpretability analysis carried out using the Sobol method demonstrated its ability to reasonably capture the varying relative significance of different input features under different flow conditions. The Sobol sensitivity analysis also observed two patterns of extracting information from the input features in ML models: (1) the main effect of individual features in ensemble learning and (2) the interactive effect between each feature in SVR. From the results, the models obtaining individual information both predict the cavity length more accurately than that using interactive information. Subsequently, the XGBOOST captures more correct information from features, which leads to the varied Sobol index in accordance with outside phenomena; meanwhile, the predicted results fit the experimental points best. MDPI 2022-12-30 /pmc/articles/PMC9819684/ /pubmed/36613022 http://dx.doi.org/10.3390/ijerph20010702 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Prediction of Cavity Length Using an Interpretable Ensemble Learning Approach
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9819684/ https://www.ncbi.nlm.nih.gov/pubmed/36613022 http://dx.doi.org/10.3390/ijerph20010702
