An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment

Machine learning models have become a fruitful tool in water resources modelling. However, they require a significant amount of data for training and validation, which poses challenges in data-scarce environments, particularly for poorly monitored basins. In such scenarios, using...

Bibliographic Details
Main authors: Nafii, Ayoub; Lamane, Houda; Taleb, Abdeslam; El Bilali, Ali
Format: Online Article Text
Language: English
Published: Elsevier 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9971125/
https://www.ncbi.nlm.nih.gov/pubmed/36865649
http://dx.doi.org/10.1016/j.mex.2023.102034
author Nafii, Ayoub
Lamane, Houda
Taleb, Abdeslam
El Bilali, Ali
collection PubMed
description Machine learning models have become a fruitful tool in water resources modelling. However, they require a significant amount of data for training and validation, which poses challenges in data-scarce environments, particularly for poorly monitored basins. In such scenarios, the Virtual Sample Generation (VSG) method is valuable for developing ML models. The main aim of this manuscript is to introduce a novel VSG method based on multivariate distribution and a Gaussian copula, called MVD-VSG, whereby appropriate virtual combinations of groundwater quality parameters can be generated to train a Deep Neural Network (DNN) to predict the Entropy Weighted Water Quality Index (EWQI) of aquifers even with small datasets. The MVD-VSG is original and was validated for its initial application using sufficient observed datasets collected from two aquifers. The validation results showed that, from only 20 original samples, the MVD-VSG provided enough accuracy to predict EWQI with a Nash-Sutcliffe efficiency (NSE) of 0.87. The companion publication of this Method paper is El Bilali et al. [1]. • Development of MVD-VSG to generate virtual combinations of groundwater parameters in a data-scarce environment. • Training a deep neural network to predict groundwater quality. • Validation of the method with sufficient observed datasets and sensitivity analysis.
format Online
Article
Text
id pubmed-9971125
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9971125 2023-03-01 MethodsX Method Article Elsevier 2023-02-02 /pmc/articles/PMC9971125/ /pubmed/36865649 http://dx.doi.org/10.1016/j.mex.2023.102034 Text en © 2023 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
title An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment
topic Method Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9971125/
https://www.ncbi.nlm.nih.gov/pubmed/36865649
http://dx.doi.org/10.1016/j.mex.2023.102034