Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation
Reconstructing missing streamflow data can be challenging when additional data are not available, and studies that impute real-world datasets to investigate how to ascertain the accuracy of imputation algorithms for such datasets are lacking. This study investigated the necessary complexity...
Main authors: | Baddoo, Thelma Dede; Li, Zhijia; Odai, Samuel Nii; Boni, Kenneth Rodolphe Chabi; Nooni, Isaac Kwesi; Andam-Akorful, Samuel Ato |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8394992/ https://www.ncbi.nlm.nih.gov/pubmed/34444127 http://dx.doi.org/10.3390/ijerph18168375 |
_version_ | 1783744071073792000 |
---|---|
author | Baddoo, Thelma Dede Li, Zhijia Odai, Samuel Nii Boni, Kenneth Rodolphe Chabi Nooni, Isaac Kwesi Andam-Akorful, Samuel Ato |
author_facet | Baddoo, Thelma Dede Li, Zhijia Odai, Samuel Nii Boni, Kenneth Rodolphe Chabi Nooni, Isaac Kwesi Andam-Akorful, Samuel Ato |
author_sort | Baddoo, Thelma Dede |
collection | PubMed |
description | Reconstructing missing streamflow data can be challenging when additional data are not available, and studies that impute real-world datasets to investigate how to ascertain the accuracy of imputation algorithms for such datasets are lacking. This study investigated how complex a missing data reconstruction scheme must be to obtain relevant results for a real-world single-station streamflow observation and so facilitate its further use. The investigation applied different missing data mechanisms, ranging from univariate algorithms to multiple imputation methods designed for multivariate data, with time taken as an explicit variable. The accuracy of these schemes was assessed using the total error measurement (TEM) and a localized error measurement (LEM) recommended in this study. The results show that univariate missing value algorithms, which are specially developed to handle univariate time series, provide satisfactory results, but those that provide the best results are usually time-consuming and computationally intensive. Multiple imputation algorithms that consider the surrounding observed values and/or can capture the characteristics of the data provide results similar to those of the univariate missing data algorithms and, in some cases, perform better without the added time and computational cost when time is taken as an explicit variable. Furthermore, the LEM would be especially useful when the missing data are in specific portions of the dataset or where very large gaps of ‘missingness’ occur. Finally, proper handling of missing values in real-world hydroclimatic datasets depends on imputing and extensively studying the particular dataset to be imputed. |
format | Online Article Text |
id | pubmed-8394992 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8394992 2021-08-28 Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation Baddoo, Thelma Dede Li, Zhijia Odai, Samuel Nii Boni, Kenneth Rodolphe Chabi Nooni, Isaac Kwesi Andam-Akorful, Samuel Ato Int J Environ Res Public Health Article Reconstructing missing streamflow data can be challenging when additional data are not available, and missing data imputation of real-world datasets to investigate how to ascertain the accuracy of imputation algorithms for these datasets are lacking. This study investigated the necessary complexity of missing data reconstruction schemes to obtain the relevant results for a real-world single station streamflow observation to facilitate its further use. This investigation was implemented by applying different missing data mechanisms spanning from univariate algorithms to multiple imputation methods accustomed to multivariate data taking time as an explicit variable. The performance accuracy of these schemes was assessed using the total error measurement (TEM) and a recommended localized error measurement (LEM) in this study. The results show that univariate missing value algorithms, which are specially developed to handle univariate time series, provide satisfactory results, but the ones which provide the best results are usually time and computationally intensive. Also, multiple imputation algorithms which consider the surrounding observed values and/or which can understand the characteristics of the data provide similar results to the univariate missing data algorithms and, in some cases, perform better without the added time and computational downsides when time is taken as an explicit variable. Furthermore, the LEM would be especially useful when the missing data are in specific portions of the dataset or where very large gaps of ‘missingness’ occur. Finally, proper handling of missing values of real-world hydroclimatic datasets depends on imputing and extensive study of the particular dataset to be imputed. MDPI 2021-08-07 /pmc/articles/PMC8394992/ /pubmed/34444127 http://dx.doi.org/10.3390/ijerph18168375 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Baddoo, Thelma Dede Li, Zhijia Odai, Samuel Nii Boni, Kenneth Rodolphe Chabi Nooni, Isaac Kwesi Andam-Akorful, Samuel Ato Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation |
title | Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation |
title_full | Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation |
title_fullStr | Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation |
title_full_unstemmed | Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation |
title_short | Comparison of Missing Data Infilling Mechanisms for Recovering a Real-World Single Station Streamflow Observation |
title_sort | comparison of missing data infilling mechanisms for recovering a real-world single station streamflow observation |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8394992/ https://www.ncbi.nlm.nih.gov/pubmed/34444127 http://dx.doi.org/10.3390/ijerph18168375 |