A new approach for processing climate missing databases applied to daily rainfall data in Soummam watershed, Algeria
Missing data is a very frequent problem in climatology; it affects the quality of the results obtained in hydrological studies and in water resources management. This paper proposes a new imputation algorithm based on the optimization of several regression methods: hot deck, k...
Main authors: | Aieb, Amir; Madani, Khodir; Scarpa, Marco; Bonaccorso, Brunella; Lefsih, Khalef |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2019 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6384304/ https://www.ncbi.nlm.nih.gov/pubmed/30886916 http://dx.doi.org/10.1016/j.heliyon.2019.e01247 |
_version_ | 1783396971719950336 |
---|---|
author | Aieb, Amir Madani, Khodir Scarpa, Marco Bonaccorso, Brunella Lefsih, Khalef |
author_facet | Aieb, Amir Madani, Khodir Scarpa, Marco Bonaccorso, Brunella Lefsih, Khalef |
author_sort | Aieb, Amir |
collection | PubMed |
description | Missing data is a very frequent problem in climatology; it affects the quality of the results obtained in hydrological studies and in water resources management. This paper proposes a new imputation algorithm based on the optimization of several regression methods: hot deck, k-nearest-neighbors imputation, weighted k-nearest-neighbors imputation, multiple imputation, linear regression and the simple average method. The choice of these methods was justified by qualitative and quantitative statistical tests. However, the reliability of the results depends mainly on the percentage of missing data, the choice of neighboring stations and the missingness mechanism, which should be missing at random. The study found that most stations in the Soummam watershed are not well correlated, because of either the large loss of rainfall data or the geology of the watershed, which links station position to rainfall variability. To address this, principal component analysis was applied to a set of stations; it showed a positive impact of altitude, latitude and longitude on the correlation index between the selected stations. Graphical analysis of the normal law on the RMSE values obtained by applying the proposed technique to several random missingness cases (4%, 8%, 12% and 16%) confirmed the validity and performance of this approach. |
format | Online Article Text |
id | pubmed-6384304 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-6384304 2019-03-18 A new approach for processing climate missing databases applied to daily rainfall data in Soummam watershed, Algeria Aieb, Amir Madani, Khodir Scarpa, Marco Bonaccorso, Brunella Lefsih, Khalef Heliyon Article Missing data is a very frequent problem in climatology; it affects the quality of the results obtained in hydrological studies and in water resources management. This paper proposes a new imputation algorithm based on the optimization of several regression methods: hot deck, k-nearest-neighbors imputation, weighted k-nearest-neighbors imputation, multiple imputation, linear regression and the simple average method. The choice of these methods was justified by qualitative and quantitative statistical tests. However, the reliability of the results depends mainly on the percentage of missing data, the choice of neighboring stations and the missingness mechanism, which should be missing at random. The study found that most stations in the Soummam watershed are not well correlated, because of either the large loss of rainfall data or the geology of the watershed, which links station position to rainfall variability. To address this, principal component analysis was applied to a set of stations; it showed a positive impact of altitude, latitude and longitude on the correlation index between the selected stations. Graphical analysis of the normal law on the RMSE values obtained by applying the proposed technique to several random missingness cases (4%, 8%, 12% and 16%) confirmed the validity and performance of this approach. Elsevier 2019-02-21 /pmc/articles/PMC6384304/ /pubmed/30886916 http://dx.doi.org/10.1016/j.heliyon.2019.e01247 Text en © 2019 Published by Elsevier Ltd. http://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Article Aieb, Amir Madani, Khodir Scarpa, Marco Bonaccorso, Brunella Lefsih, Khalef A new approach for processing climate missing databases applied to daily rainfall data in Soummam watershed, Algeria |
title | A new approach for processing climate missing databases applied to daily rainfall data in Soummam watershed, Algeria |
title_sort | new approach for processing climate missing databases applied to daily rainfall data in soummam watershed, algeria |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6384304/ https://www.ncbi.nlm.nih.gov/pubmed/30886916 http://dx.doi.org/10.1016/j.heliyon.2019.e01247 |
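
The description above evaluates imputation by hiding values at random (4%, 8%, 12% and 16%) and scoring the filled values with RMSE. The sketch below illustrates that kind of evaluation loop with a plain inverse-distance weighted k-nearest-neighbors fill. It is not the authors' optimized algorithm: the function names, the value of `k`, the day-similarity distance and the synthetic rainfall series are all assumptions made for illustration only.

```python
import numpy as np


def weighted_knn_impute(target, neighbors, k=5):
    """Fill NaNs in `target` using the k most similar observed days.

    Assumption: similarity between two days is the Euclidean distance
    between their rows in `neighbors` (days x stations), and donor days
    are weighted by inverse distance.
    """
    filled = target.copy()
    observed = ~np.isnan(target)
    donors_x = neighbors[observed]
    donors_y = target[observed]
    for i in np.flatnonzero(~observed):
        dist = np.linalg.norm(donors_x - neighbors[i], axis=1)
        nearest = np.argsort(dist)[:k]
        weights = 1.0 / (dist[nearest] + 1e-9)  # small offset avoids division by zero
        filled[i] = np.sum(weights * donors_y[nearest]) / np.sum(weights)
    return filled


def rmse_under_random_missingness(target, neighbors, rate, rng, k=5):
    """Hide a random fraction `rate` of `target`, impute it, return the RMSE."""
    n = target.size
    n_missing = int(round(rate * n))
    hidden = rng.choice(n, size=n_missing, replace=False)
    masked = target.copy()
    masked[hidden] = np.nan
    filled = weighted_knn_impute(masked, neighbors, k=k)
    return float(np.sqrt(np.mean((filled[hidden] - target[hidden]) ** 2)))


# Synthetic daily rainfall (hypothetical data, not the Soummam records):
# one regional signal seen by four neighboring stations and one target station.
rng = np.random.default_rng(0)
n_days, n_stations = 365, 4
regional = rng.gamma(shape=0.5, scale=8.0, size=n_days)
neighbors = np.clip(
    regional[:, None] * rng.uniform(0.6, 1.4, size=n_stations)
    + rng.normal(0.0, 0.5, size=(n_days, n_stations)),
    0.0,
    None,
)
target = np.clip(0.9 * regional + rng.normal(0.0, 0.5, size=n_days), 0.0, None)

# Missingness rates taken from the description: 4%, 8%, 12% and 16%.
results = {
    rate: rmse_under_random_missingness(target, neighbors, rate, rng)
    for rate in (0.04, 0.08, 0.12, 0.16)
}
for rate, err in results.items():
    print(f"{rate:.0%} missing: RMSE = {err:.3f}")
```

Values are hidden completely at random here, which is the simplest way to satisfy the missing-at-random condition the description mentions; real station records would need that assumption checked before trusting the scores.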