A new approach for processing climate missing databases applied to daily rainfall data in Soummam watershed, Algeria

Missing data is a very frequent problem in climatology; it affects the quality of the results obtained in hydrological studies and in water resources management. This paper proposes a new imputation algorithm based on the optimization of several regression methods, which are hot deck, k...

Bibliographic Details
Main authors: Aieb, Amir, Madani, Khodir, Scarpa, Marco, Bonaccorso, Brunella, Lefsih, Khalef
Format: Online Article Text
Language: English
Published: Elsevier 2019
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6384304/
https://www.ncbi.nlm.nih.gov/pubmed/30886916
http://dx.doi.org/10.1016/j.heliyon.2019.e01247
_version_ 1783396971719950336
author Aieb, Amir
Madani, Khodir
Scarpa, Marco
Bonaccorso, Brunella
Lefsih, Khalef
collection PubMed
description Missing data is a very frequent problem in climatology; it affects the quality of the results obtained in hydrological studies and in water resources management. This paper proposes a new imputation algorithm based on the optimization of several regression methods: hot deck, k-nearest-neighbors imputation, weighted k-nearest-neighbors imputation, multiple imputation, linear regression and the simple average method. The choice of these methods was justified by qualitative and quantitative statistical tests. However, the reliability of the results depends mainly on the percentage of missing data, the choice of neighboring stations and the missingness mechanism, which should be missing at random. During the study it was found that most stations in the Soummam watershed are not well correlated, either because of the large loss of rainfall data or because of the geology of the watershed, which creates a relationship between station position and rainfall variability. For this case, principal component analysis was applied to a set of stations; it showed a positive impact of altitude, latitude and longitude on the correlation index between selected stations. A graphical analysis of the normal law on the RMSE values, obtained by applying the proposed technique to several random missingness cases (4%, 8%, 12% and 16%), confirmed the validity and performance of this approach.
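The evaluation idea in the abstract (hide a known fraction of values at random, impute them, and score the result with RMSE) can be illustrated with a minimal sketch. This is not the authors' optimized algorithm: it only runs three of the named baseline methods (simple average, linear regression, weighted k-nearest neighbors) on synthetic data. The function names, the synthetic rainfall generator, the inverse-distance weighting and the choice of k = 5 are all assumptions made for illustration.

```python
import numpy as np

def rmse(truth, estimate):
    """Root mean square error between true and imputed values."""
    return float(np.sqrt(np.mean((truth - estimate) ** 2)))

def impute_simple_average(neighbors, missing):
    # Mean of the neighboring stations on each missing day.
    return neighbors[missing].mean(axis=1)

def impute_linear_regression(target, neighbors, missing):
    # Least-squares fit of the target station on its neighbors,
    # estimated from the days where the target is observed.
    X = np.column_stack([np.ones(len(target)), neighbors])
    coef, *_ = np.linalg.lstsq(X[~missing], target[~missing], rcond=None)
    return X[missing] @ coef

def impute_weighted_knn(target, neighbors, missing, k=5):
    # For each missing day, find the k observed days whose neighbor
    # readings are closest, then average their target values with
    # inverse-distance weights (an assumed weighting scheme).
    obs_X, obs_y = neighbors[~missing], target[~missing]
    out = np.empty(int(missing.sum()))
    for i, row in enumerate(neighbors[missing]):
        dist = np.linalg.norm(obs_X - row, axis=1)
        idx = np.argsort(dist)[:k]
        w = 1.0 / (dist[idx] + 1e-9)
        out[i] = np.sum(w * obs_y[idx]) / np.sum(w)
    return out

def evaluate(target, neighbors, fractions=(0.04, 0.08, 0.12, 0.16), seed=0):
    """Hide each fraction of target values completely at random and score each method."""
    rng = np.random.default_rng(seed)
    results = {}
    for f in fractions:
        missing = np.zeros(len(target), dtype=bool)
        n_missing = int(round(f * len(target)))
        missing[rng.choice(len(target), size=n_missing, replace=False)] = True
        truth = target[missing]
        results[f] = {
            "simple_average": rmse(truth, impute_simple_average(neighbors, missing)),
            "linear_regression": rmse(truth, impute_linear_regression(target, neighbors, missing)),
            "weighted_knn": rmse(truth, impute_weighted_knn(target, neighbors, missing)),
        }
    return results

def make_synthetic_rainfall(n_days=400, n_neighbors=3, seed=1):
    # Hypothetical data: a shared gamma-distributed daily signal plus station noise.
    rng = np.random.default_rng(seed)
    base = rng.gamma(shape=0.6, scale=8.0, size=n_days)
    neighbors = np.column_stack([
        np.clip(base * rng.uniform(0.7, 1.3) + rng.normal(0, 1.0, n_days), 0, None)
        for _ in range(n_neighbors)
    ])
    target = np.clip(base + rng.normal(0, 1.0, n_days), 0, None)
    return target, neighbors

target, neighbors = make_synthetic_rainfall()
for fraction, scores in evaluate(target, neighbors).items():
    print(f"{fraction:.0%} missing:", {k: round(v, 2) for k, v in scores.items()})
```

The random mask here is a simplification of the missingness mechanism, and the synthetic stations are far better correlated than the abstract reports for the Soummam watershed, so the printed scores say nothing about the real data.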
format Online
Article
Text
id pubmed-6384304
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-6384304 2019-03-18 Heliyon Article Elsevier 2019-02-21 /pmc/articles/PMC6384304/ /pubmed/30886916 http://dx.doi.org/10.1016/j.heliyon.2019.e01247 Text en © 2019 Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title A new approach for processing climate missing databases applied to daily rainfall data in Soummam watershed, Algeria
topic Article