Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest

Bibliographic Details
Main Authors: Arhab, Muhammad, Huang, Jingshui
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346477/
https://www.ncbi.nlm.nih.gov/pubmed/37447905
http://dx.doi.org/10.3390/s23136057
_version_ 1785073322846846976
author Arhab, Muhammad
Huang, Jingshui
collection PubMed
description Despite advancements in sensor technology, monitoring nutrients in situ and in real-time is still challenging and expensive. Soft sensors, based on data-driven models, offer an alternative to direct nutrient measurements. However, the high demand for data required for their development poses logistical issues with data handling. To address this, the study aimed to determine the optimal subset of predictors and the sampling frequency for developing nutrient soft sensors using random forest. The study used water quality data at 15-min intervals from 2 automatic stations on the Main River, Germany, and included dissolved oxygen, temperature, conductivity, pH, streamflow, and cyclical time features as predictors. The optimal subset of predictors was identified using forward subset selection, and the models fitted with the optimal predictors produced R² values above 0.95 for nitrate, orthophosphate, and ammonium for both stations. The study then trained the models on 40 sampling frequencies, ranging from monthly to 15-min intervals. The results showed that as the sampling frequency increased, the model’s performance, measured by RMSE, improved. The optimal balance between sampling frequency and model performance was identified using a knee-point determination algorithm. The optimal sampling frequency for nitrate was 3.6 and 2.8 h for the 2 stations, respectively. For orthophosphate, it was 2.4 and 1.8 h. For ammonium, it was 2.2 h for 1 station. The study highlights the utility of surrogate models for monitoring nutrient levels and demonstrates that nutrient soft sensors can function with fewer predictors at lower frequencies without significantly decreasing performance.
format Online
Article
Text
id pubmed-10346477
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10346477 2023-07-15 Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest Arhab, Muhammad Huang, Jingshui Sensors (Basel) Article
MDPI 2023-06-30 /pmc/articles/PMC10346477/ /pubmed/37447905 http://dx.doi.org/10.3390/s23136057 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346477/
https://www.ncbi.nlm.nih.gov/pubmed/37447905
http://dx.doi.org/10.3390/s23136057
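
Note on the knee-point step named in the description: the record does not say which knee-point algorithm the authors used, so the following is only a minimal sketch of one common heuristic (the point farthest from the straight line joining the two ends of the curve), not the study's method. The function name `knee_point` and the interval and RMSE values are hypothetical and purely illustrative; they are not results from the article.

```python
def knee_point(x, y):
    """Return the index of the point farthest from the chord that joins
    the first and last points of the curve (a simple knee heuristic)."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least three (x, y) pairs of equal length")

    # Rescale both axes to [0, 1] so the result does not depend on units.
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    if x_max == x_min or y_max == y_min:
        raise ValueError("curve is flat along one axis; no knee to find")
    xs = [(v - x_min) / (x_max - x_min) for v in x]
    ys = [(v - y_min) / (y_max - y_min) for v in y]

    # Chord between the first and last rescaled points.
    dx, dy = xs[-1] - xs[0], ys[-1] - ys[0]
    norm = (dx * dx + dy * dy) ** 0.5
    if norm == 0:
        raise ValueError("first and last points coincide")

    # Perpendicular distance of every point to that chord.
    distances = [
        abs(dy * (px - xs[0]) - dx * (py - ys[0])) / norm
        for px, py in zip(xs, ys)
    ]
    return max(range(len(distances)), key=distances.__getitem__)


if __name__ == "__main__":
    # Synthetic example only: sampling interval in hours vs. model RMSE.
    interval_h = [0.25, 0.5, 1, 2, 4, 8, 24, 168, 720]
    rmse = [0.10, 0.11, 0.12, 0.14, 0.20, 0.35, 0.60, 0.90, 1.00]
    i = knee_point(interval_h, rmse)
    print(f"knee at interval {interval_h[i]} h (RMSE {rmse[i]})")
```

Rescaling both axes first matters here because sampling intervals span several orders of magnitude while RMSE does not; without it the distance calculation would be dominated by the interval axis.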