A Machine Learning Approach to Predict Watershed Health Indices for Sediments and Nutrients at Ungauged Basins

Effective water quality management and reliable environmental modeling depend on the availability, size, and quality of water quality (WQ) data. Observed stream water quality data are usually sparse in both time and space. Reconstruction of water quality time series using surrogate variables such...

Bibliographic Details
Main Authors: Mallya, Ganeshchandra; Hantush, Mohamed M.; Govindaraju, Rao S.
Format: Online Article Text
Language: English
Published: 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10259765/
https://www.ncbi.nlm.nih.gov/pubmed/37309416
http://dx.doi.org/10.3390/w15030586
_version_ 1785057719154114560
author Mallya, Ganeshchandra
Hantush, Mohamed M.
Govindaraju, Rao S.
collection PubMed
description Effective water quality management and reliable environmental modeling depend on the availability, size, and quality of water quality (WQ) data. Observed stream water quality data are usually sparse in both time and space. Reconstruction of water quality time series using surrogate variables such as streamflow has been used to evaluate risk metrics such as reliability, resilience, vulnerability, and watershed health (WH), but only at gauged locations. Estimating these indices for ungauged watersheds has not been attempted because of the high-dimensional nature of the potential predictor space. In this study, machine learning (ML) models, namely random forest regression, AdaBoost, gradient boosting machines, and Bayesian ridge regression (along with an ensemble model), were evaluated to predict watershed health and other risk metrics at ungauged hydrologic unit code 10 (HUC-10) basins using watershed attributes, long-term climate data, soil data, land use and land cover data, fertilizer sales data, and geographic information as predictor variables. These ML models were tested over the Upper Mississippi River Basin, the Ohio River Basin, and the Maumee River Basin for water quality constituents such as suspended sediment concentration, nitrogen, and phosphorus. Random forest, AdaBoost, and gradient boosting regressors typically showed a coefficient of determination [Formula: see text] for suspended sediment concentration and nitrogen during the testing stage, while the ensemble model exhibited [Formula: see text]. Watershed health values with respect to suspended sediments and nitrogen predicted by all ML models, including the ensemble model, were lower for areas with larger agricultural land use, moderate for areas with predominantly urban land use, and higher for forested areas; the trained ML models adequately predicted WH in ungauged basins.
However, low WH values (with respect to phosphorus) were predicted at some basins in the Upper Mississippi River Basin that had dominant forest land use. Results suggest that the proposed ML models provide robust estimates at ungauged locations when sufficient training data are available for a WQ constituent. ML models may be used as quick screening tools by decision makers and water quality monitoring agencies for identifying critical source areas or hotspots with respect to different water quality constituents, even for ungauged watersheds.
format Online
Article
Text
id pubmed-10259765
institution National Center for Biotechnology Information
language English
publishDate 2023
record_format MEDLINE/PubMed
spelling pubmed-10259765 2023-06-12 A Machine Learning Approach to Predict Watershed Health Indices for Sediments and Nutrients at Ungauged Basins Mallya, Ganeshchandra Hantush, Mohamed M. Govindaraju, Rao S. Water (Basel) Article 2023 /pmc/articles/PMC10259765/ /pubmed/37309416 http://dx.doi.org/10.3390/w15030586 Text en https://creativecommons.org/licenses/by/4.0/ This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title A Machine Learning Approach to Predict Watershed Health Indices for Sediments and Nutrients at Ungauged Basins
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10259765/
https://www.ncbi.nlm.nih.gov/pubmed/37309416
http://dx.doi.org/10.3390/w15030586
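
The description above reports model skill as a coefficient of determination and compares individual regressors with an ensemble model. The following is a minimal sketch of how such a comparison might be scored, not the authors' implementation. The record does not say how the ensemble combines its member models, so the equal-weight average used here is an assumption, and the function names and all numbers are made up for illustration.

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


def ensemble_mean(predictions_by_model):
    """Equal-weight average of per-basin predictions (an assumed combination rule)."""
    return [mean(values) for values in zip(*predictions_by_model.values())]


# Hypothetical watershed health values at four held-out basins (illustrative only).
observed_wh = [0.20, 0.45, 0.70, 0.85]
predictions = {
    "random_forest": [0.25, 0.40, 0.65, 0.80],
    "adaboost": [0.30, 0.50, 0.60, 0.90],
    "gradient_boosting": [0.15, 0.45, 0.75, 0.85],
}

# Score each model, then the assumed equal-weight ensemble.
scores = {name: r_squared(observed_wh, preds) for name, preds in predictions.items()}
scores["ensemble"] = r_squared(observed_wh, ensemble_mean(predictions))

for name, score in scores.items():
    print(f"{name}: R^2 = {score:.3f}")
```

In practice the member models would be fitted on gauged basins and scored on held-out basins; the sketch covers only the scoring step.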