
Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation

Reconstructing missing streamflow data can be challenging when additional data are not available, and studies that impute real-world datasets to investigate how to ascertain the accuracy of imputation algorithms for such datasets are lacking. This study investigated the necessary complexity...


Bibliographic Details
Main Authors: Baddoo, Thelma Dede, Li, Zhijia, Odai, Samuel Nii, Boni, Kenneth Rodolphe Chabi, Nooni, Isaac Kwesi, Andam-Akorful, Samuel Ato
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8394992/
https://www.ncbi.nlm.nih.gov/pubmed/34444127
http://dx.doi.org/10.3390/ijerph18168375
author Baddoo, Thelma Dede
Li, Zhijia
Odai, Samuel Nii
Boni, Kenneth Rodolphe Chabi
Nooni, Isaac Kwesi
Andam-Akorful, Samuel Ato
collection PubMed
description Reconstructing missing streamflow data can be challenging when additional data are not available, and studies that impute real-world datasets to investigate how to ascertain the accuracy of imputation algorithms for such datasets are lacking. This study investigated the complexity that missing data reconstruction schemes need in order to obtain relevant results for a real-world single station streamflow observation and so facilitate its further use. The investigation applied different missing data mechanisms, ranging from univariate algorithms to multiple imputation methods designed for multivariate data, with time taken as an explicit variable. The accuracy of these schemes was assessed using the total error measurement (TEM) and a localized error measurement (LEM) recommended in this study. The results show that univariate missing value algorithms, which are specially developed to handle univariate time series, provide satisfactory results, but the ones that provide the best results are usually time-consuming and computationally intensive. In addition, multiple imputation algorithms that consider the surrounding observed values and/or can capture the characteristics of the data provide results similar to those of the univariate missing data algorithms and, in some cases, perform better without the added time and computational cost when time is taken as an explicit variable. Furthermore, the LEM would be especially useful when the missing data are in specific portions of the dataset or where very large gaps of ‘missingness’ occur. Finally, proper handling of missing values in real-world hydroclimatic datasets depends on both imputation and extensive study of the particular dataset to be imputed.
format Online
Article
Text
id pubmed-8394992
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8394992 2021-08-28 Int J Environ Res Public Health Article MDPI 2021-08-07 /pmc/articles/PMC8394992/ /pubmed/34444127 http://dx.doi.org/10.3390/ijerph18168375 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8394992/
https://www.ncbi.nlm.nih.gov/pubmed/34444127
http://dx.doi.org/10.3390/ijerph18168375