An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment
Machine Learning models have become a fruitful tool in water resources modelling. However, they require a significant amount of data for training and validation, which poses challenges in data-scarce environments, particularly in poorly monitored basins. In such scenarios, using...
Main authors: | Nafii, Ayoub; Lamane, Houda; Taleb, Abdeslam; El Bilali, Ali |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier 2023 |
Subjects: | Method Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9971125/ https://www.ncbi.nlm.nih.gov/pubmed/36865649 http://dx.doi.org/10.1016/j.mex.2023.102034 |
_version_ | 1784898043356643328 |
---|---|
author | Nafii, Ayoub Lamane, Houda Taleb, Abdeslam El Bilali, Ali |
author_facet | Nafii, Ayoub Lamane, Houda Taleb, Abdeslam El Bilali, Ali |
author_sort | Nafii, Ayoub |
collection | PubMed |
description | Machine Learning (ML) models have become a fruitful tool in water resources modelling. However, they require a significant amount of data for training and validation, which poses challenges in data-scarce environments, particularly in poorly monitored basins. In such scenarios, a Virtual Sample Generation (VSG) method is valuable for developing ML models. The main aim of this manuscript is to introduce a novel VSG method based on multivariate distribution and a Gaussian copula, called MVD-VSG, which generates appropriate virtual combinations of groundwater quality parameters to train a Deep Neural Network (DNN) for predicting the Entropy Weighted Water Quality Index (EWQI) of aquifers, even with small datasets. The MVD-VSG is original and was validated for its initial application using sufficient observed datasets collected from two aquifers. The validation results showed that, from only 20 original samples, the MVD-VSG provided enough accuracy to predict EWQI with an NSE of 0.87. The companion publication of this Method paper is El Bilali et al. [1]. • Development of MVD-VSG to generate virtual combinations of groundwater parameters in data-scarce environments. • Training a deep neural network to predict groundwater quality. • Validation of the method with sufficient observed datasets and sensitivity analysis. |
format | Online Article Text |
id | pubmed-9971125 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Elsevier |
record_format | MEDLINE/PubMed |
spelling | pubmed-9971125 2023-03-01 An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment Nafii, Ayoub Lamane, Houda Taleb, Abdeslam El Bilali, Ali MethodsX Method Article Machine Learning (ML) models have become a fruitful tool in water resources modelling. However, they require a significant amount of data for training and validation, which poses challenges in data-scarce environments, particularly in poorly monitored basins. In such scenarios, a Virtual Sample Generation (VSG) method is valuable for developing ML models. The main aim of this manuscript is to introduce a novel VSG method based on multivariate distribution and a Gaussian copula, called MVD-VSG, which generates appropriate virtual combinations of groundwater quality parameters to train a Deep Neural Network (DNN) for predicting the Entropy Weighted Water Quality Index (EWQI) of aquifers, even with small datasets. The MVD-VSG is original and was validated for its initial application using sufficient observed datasets collected from two aquifers. The validation results showed that, from only 20 original samples, the MVD-VSG provided enough accuracy to predict EWQI with an NSE of 0.87. The companion publication of this Method paper is El Bilali et al. [1]. • Development of MVD-VSG to generate virtual combinations of groundwater parameters in data-scarce environments. • Training a deep neural network to predict groundwater quality. • Validation of the method with sufficient observed datasets and sensitivity analysis. Elsevier 2023-02-02 /pmc/articles/PMC9971125/ /pubmed/36865649 http://dx.doi.org/10.1016/j.mex.2023.102034 Text en © 2023 The Authors. Published by Elsevier B.V. https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Method Article Nafii, Ayoub Lamane, Houda Taleb, Abdeslam El Bilali, Ali An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment |
title | An approach based on multivariate distribution and Gaussian copulas to predict groundwater quality using DNN models in a data scarce environment |
title_sort | approach based on multivariate distribution and gaussian copulas to predict groundwater quality using dnn models in a data scarce environment |
topic | Method Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9971125/ https://www.ncbi.nlm.nih.gov/pubmed/36865649 http://dx.doi.org/10.1016/j.mex.2023.102034 |
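
The description above mentions generating virtual samples with a Gaussian copula and scoring predictions with NSE, which in hydrology normally denotes the Nash-Sutcliffe efficiency. The record does not give the authors' procedure, so the following is only a minimal sketch of the general technique, not the MVD-VSG method itself. It assumes empirical (rank-based) marginals in place of fitted distributions, and every function name and the sample values in the usage note are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def normal_scores(column):
    """Map each value to a standard-normal score through its rank."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # rank / (n + 1) keeps the probability strictly inside (0, 1)
        scores[idx] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def cholesky(matrix):
    """Lower-triangular Cholesky factor (assumes a positive definite input)."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - partial)
            else:
                lower[i][j] = (matrix[i][j] - partial) / lower[j][j]
    return lower


def empirical_quantile(sorted_values, u):
    """Linearly interpolated quantile of the observed values for u in [0, 1]."""
    pos = u * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    return sorted_values[lo] + (pos - lo) * (sorted_values[hi] - sorted_values[lo])


def generate_virtual_samples(samples, n_virtual, seed=0):
    """Draw virtual rows whose dependence follows a Gaussian copula.

    Assumption of this sketch: marginals are empirical, so virtual values
    never leave the observed range of each parameter.
    """
    rng = random.Random(seed)
    columns = [list(col) for col in zip(*samples)]
    d = len(columns)
    scores = [normal_scores(col) for col in columns]
    corr = [[1.0 if i == j else pearson(scores[i], scores[j]) for j in range(d)]
            for i in range(d)]
    lower = cholesky(corr)
    sorted_cols = [sorted(col) for col in columns]
    virtual = []
    for _ in range(n_virtual):
        eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
        # Correlated normals z = L * eps, then uniforms u = Phi(z)
        z = [sum(lower[i][k] * eps[k] for k in range(i + 1)) for i in range(d)]
        u = [STD_NORMAL.cdf(v) for v in z]
        virtual.append([empirical_quantile(sorted_cols[i], u[i]) for i in range(d)])
    return virtual


def nse(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 - SSE / sum of squares about the observed mean."""
    mean_obs = mean(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / spread
```

As a usage illustration with made-up numbers, `generate_virtual_samples([[7.1, 310], [7.4, 420], [6.9, 330], [7.8, 450], [7.3, 500], [7.6, 380]], 100)` returns 100 two-parameter rows. The Cholesky step fails if two parameters are perfectly rank-correlated, so a real application would need to handle that case.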