Imputation of missing daily rainfall data; A comparison between artificial intelligence and statistical techniques

Handling missing values is a critical component of the data processing in hydrological modeling. The key objective of this research is to assess statistical techniques (STs) and artificial intelligence-based techniques (AITs) for imputing missing daily rainfall values and recommend a methodology app...

Bibliographic Details
Main Authors: Wangwongchai, Angkool, Waqas, Muhammad, Dechpichai, Porntip, Hlaing, Phyo Thandar, Ahmad, Shakeel, Humphries, Usa Wannasingha
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654590/
https://www.ncbi.nlm.nih.gov/pubmed/38023312
http://dx.doi.org/10.1016/j.mex.2023.102459
_version_ 1785136657450663936
author Wangwongchai, Angkool
Waqas, Muhammad
Dechpichai, Porntip
Hlaing, Phyo Thandar
Ahmad, Shakeel
Humphries, Usa Wannasingha
author_facet Wangwongchai, Angkool
Waqas, Muhammad
Dechpichai, Porntip
Hlaing, Phyo Thandar
Ahmad, Shakeel
Humphries, Usa Wannasingha
author_sort Wangwongchai, Angkool
collection PubMed
description Handling missing values is a critical component of data processing in hydrological modeling. The key objective of this research is to assess statistical techniques (STs) and artificial intelligence-based techniques (AITs) for imputing missing daily rainfall values and to recommend a methodology applicable to the mountainous terrain of northern Thailand. In this study, 30 years of daily rainfall data were collected from 20 rainfall stations in northern Thailand, and 25–35 % of the data were randomly deleted from four target stations based on the Spearman correlation coefficient between the target and neighboring stations. Imputation models were developed on training and testing datasets and statistically evaluated by mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R²), and correlation coefficient (r). The STs used were arithmetic averaging (AA), multiple linear regression (MLR), normal-ratio (NR), the nonlinear iterative partial least squares (NIPALS) algorithm, and linear interpolation. The ST results were compared with AITs, including long short-term memory recurrent neural network (LSTM-RNN), M5 model tree (M5-MT), multilayer perceptron neural networks (MLPNN), and support vector regression with polynomial and radial basis function kernels (SVR-poly and SVR-RBF). The findings revealed that the MLR imputation model achieved an average MAE of 0.98, an RMSE of 4.52, and an R² of about 79.6 % across all target stations. The M5-MT model achieved an average MAE of 0.91, an RMSE of about 4.52, and an R² of around 79.8 %, and was the most prominent among the AITs. Notably, the MLR technique stood out as the recommended approach because it delivers good estimation results while offering a transparent mechanism and not requiring prior knowledge for model creation.
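The description names MLR and arithmetic averaging among the imputation techniques and MAE, RMSE, R², and r as the evaluation metrics. A minimal Python sketch of that kind of workflow, assuming NumPy and entirely synthetic data (not the study's station records), might look like the following. The function names, the 30 % deletion rate (picked from the stated 25–35 % range), the clipping of negative estimates to zero, and the definition R² = 1 − SS_res/SS_tot are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def mae(obs, est):
    """Mean absolute error between observed and estimated values."""
    return float(np.mean(np.abs(obs - est)))

def rmse(obs, est):
    """Root mean square error between observed and estimated values."""
    return float(np.sqrt(np.mean((obs - est) ** 2)))

def r_squared(obs, est):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    ss_res = np.sum((obs - est) ** 2)
    ss_tot = np.sum((obs - np.mean(obs)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def pearson_r(obs, est):
    """Pearson correlation coefficient r."""
    return float(np.corrcoef(obs, est)[0, 1])

def arithmetic_average_impute(target, neighbors):
    """Fill NaNs in target with the mean of the neighbor stations on that day."""
    missing = np.isnan(target)
    filled = target.copy()
    filled[missing] = neighbors[missing].mean(axis=1)
    return filled

def mlr_impute(target, neighbors):
    """Fill NaNs in target by least-squares regression on the neighbor columns.

    Assumes the neighbor series themselves contain no missing values.
    """
    missing = np.isnan(target)
    X = np.column_stack([np.ones(len(target)), neighbors])  # intercept + predictors
    coef, *_ = np.linalg.lstsq(X[~missing], target[~missing], rcond=None)
    filled = target.copy()
    # Assumption: negative estimates are clipped to zero, since rainfall cannot be negative.
    filled[missing] = np.clip(X[missing] @ coef, 0.0, None)
    return filled

# Synthetic daily "rainfall" for one target and three neighbor stations (hypothetical).
rng = np.random.default_rng(0)
n_days = 1000
base = rng.gamma(shape=0.5, scale=8.0, size=n_days)
neighbors = np.clip(
    np.column_stack([w * base + rng.normal(0.0, 0.5, n_days) for w in (0.8, 1.0, 1.2)]),
    0.0, None,
)
truth = np.clip(0.9 * base + rng.normal(0.0, 0.5, n_days), 0.0, None)

# Randomly delete about 30 % of the target values.
mask = rng.random(n_days) < 0.30
target = truth.copy()
target[mask] = np.nan

aa_filled = arithmetic_average_impute(target, neighbors)
mlr_filled = mlr_impute(target, neighbors)

# Score each method only on the days that were deleted.
for name, filled in (("AA", aa_filled), ("MLR", mlr_filled)):
    obs, est = truth[mask], filled[mask]
    print(f"{name}: MAE={mae(obs, est):.3f} RMSE={rmse(obs, est):.3f} "
          f"R2={r_squared(obs, est):.3f} r={pearson_r(obs, est):.3f}")
```

Scoring only the deleted days mirrors the idea of evaluating imputed values against held-back observations; the numbers printed here come from the synthetic series and say nothing about the results reported in the record.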
format Online
Article
Text
id pubmed-10654590
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-10654590 2023-10-27 Imputation of missing daily rainfall data; A comparison between artificial intelligence and statistical techniques Wangwongchai, Angkool Waqas, Muhammad Dechpichai, Porntip Hlaing, Phyo Thandar Ahmad, Shakeel Humphries, Usa Wannasingha MethodsX Engineering Elsevier 2023-10-27 /pmc/articles/PMC10654590/ /pubmed/38023312 http://dx.doi.org/10.1016/j.mex.2023.102459 Text en © 2023 The Authors. Published by Elsevier B.V. https://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle Engineering
Wangwongchai, Angkool
Waqas, Muhammad
Dechpichai, Porntip
Hlaing, Phyo Thandar
Ahmad, Shakeel
Humphries, Usa Wannasingha
Imputation of missing daily rainfall data; A comparison between artificial intelligence and statistical techniques
title Imputation of missing daily rainfall data; A comparison between artificial intelligence and statistical techniques
title_full Imputation of missing daily rainfall data; A comparison between artificial intelligence and statistical techniques
title_fullStr Imputation of missing daily rainfall data; A comparison between artificial intelligence and statistical techniques
title_full_unstemmed Imputation of missing daily rainfall data; A comparison between artificial intelligence and statistical techniques
title_short Imputation of missing daily rainfall data; A comparison between artificial intelligence and statistical techniques
title_sort imputation of missing daily rainfall data; a comparison between artificial intelligence and statistical techniques
topic Engineering
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10654590/
https://www.ncbi.nlm.nih.gov/pubmed/38023312
http://dx.doi.org/10.1016/j.mex.2023.102459