Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest
Despite advancements in sensor technology, monitoring nutrients in situ and in real-time is still challenging and expensive. Soft sensors, based on data-driven models, offer an alternative to direct nutrient measurements. However, the high demand for data required for their development poses logisti...
Main authors: | Arhab, Muhammad; Huang, Jingshui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346477/ https://www.ncbi.nlm.nih.gov/pubmed/37447905 http://dx.doi.org/10.3390/s23136057 |
_version_ | 1785073322846846976 |
---|---|
author | Arhab, Muhammad Huang, Jingshui |
author_facet | Arhab, Muhammad Huang, Jingshui |
author_sort | Arhab, Muhammad |
collection | PubMed |
description | Despite advancements in sensor technology, monitoring nutrients in situ and in real-time is still challenging and expensive. Soft sensors, based on data-driven models, offer an alternative to direct nutrient measurements. However, the high demand for data required for their development poses logistical issues with data handling. To address this, the study aimed to determine the optimal subset of predictors and the sampling frequency for developing nutrient soft sensors using random forest. The study used water quality data at 15-min intervals from 2 automatic stations on the Main River, Germany, and included dissolved oxygen, temperature, conductivity, pH, streamflow, and cyclical time features as predictors. The optimal subset of predictors was identified using forward subset selection, and the models fitted with the optimal predictors produced R² values above 0.95 for nitrate, orthophosphate, and ammonium for both stations. The study then trained the models on 40 sampling frequencies, ranging from monthly to 15-min intervals. The results showed that as the sampling frequency increased, the model’s performance, measured by RMSE, improved. The optimal balance between sampling frequency and model performance was identified using a knee-point determination algorithm. The optimal sampling frequency for nitrate was 3.6 and 2.8 h for the 2 stations, respectively. For orthophosphate, it was 2.4 and 1.8 h. For ammonium, it was 2.2 h for 1 station. The study highlights the utility of surrogate models for monitoring nutrient levels and demonstrates that nutrient soft sensors can function with fewer predictors at lower frequencies without significantly decreasing performance. |
format | Online Article Text |
id | pubmed-10346477 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-103464772023-07-15 Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest Arhab, Muhammad Huang, Jingshui Sensors (Basel) Article Despite advancements in sensor technology, monitoring nutrients in situ and in real-time is still challenging and expensive. Soft sensors, based on data-driven models, offer an alternative to direct nutrient measurements. However, the high demand for data required for their development poses logistical issues with data handling. To address this, the study aimed to determine the optimal subset of predictors and the sampling frequency for developing nutrient soft sensors using random forest. The study used water quality data at 15-min intervals from 2 automatic stations on the Main River, Germany, and included dissolved oxygen, temperature, conductivity, pH, streamflow, and cyclical time features as predictors. The optimal subset of predictors was identified using forward subset selection, and the models fitted with the optimal predictors produced R² values above 0.95 for nitrate, orthophosphate, and ammonium for both stations. The study then trained the models on 40 sampling frequencies, ranging from monthly to 15-min intervals. The results showed that as the sampling frequency increased, the model’s performance, measured by RMSE, improved. The optimal balance between sampling frequency and model performance was identified using a knee-point determination algorithm. The optimal sampling frequency for nitrate was 3.6 and 2.8 h for the 2 stations, respectively. For orthophosphate, it was 2.4 and 1.8 h. For ammonium, it was 2.2 h for 1 station. The study highlights the utility of surrogate models for monitoring nutrient levels and demonstrates that nutrient soft sensors can function with fewer predictors at lower frequencies without significantly decreasing performance. 
MDPI 2023-06-30 /pmc/articles/PMC10346477/ /pubmed/37447905 http://dx.doi.org/10.3390/s23136057 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Arhab, Muhammad Huang, Jingshui Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest |
title | Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest |
title_full | Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest |
title_fullStr | Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest |
title_full_unstemmed | Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest |
title_short | Determination of Optimal Predictors and Sampling Frequency to Develop Nutrient Soft Sensors Using Random Forest |
title_sort | determination of optimal predictors and sampling frequency to develop nutrient soft sensors using random forest |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10346477/ https://www.ncbi.nlm.nih.gov/pubmed/37447905 http://dx.doi.org/10.3390/s23136057 |
work_keys_str_mv | AT arhabmuhammad determinationofoptimalpredictorsandsamplingfrequencytodevelopnutrientsoftsensorsusingrandomforest AT huangjingshui determinationofoptimalpredictorsandsamplingfrequencytodevelopnutrientsoftsensorsusingrandomforest |
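
The description above mentions a knee-point determination step that balances sampling frequency against RMSE. The record does not say which algorithm the authors used, so the following is only a minimal sketch of one common heuristic: pick the point on the curve that lies farthest from the straight line joining its two endpoints. The function name `knee_point` and the RMSE values are hypothetical and synthetic; they are not taken from the study.

```python
import math


def knee_point(x, y):
    """Return the index of the point farthest from the chord that joins
    the first and last points of the curve (a simple knee heuristic)."""
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")

    # Normalise both axes to [0, 1] so the result does not depend on units.
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    if x_max == x_min or y_max == y_min:
        raise ValueError("x and y must each span a non-zero range")
    xn = [(v - x_min) / (x_max - x_min) for v in x]
    yn = [(v - y_min) / (y_max - y_min) for v in y]

    # Perpendicular distance of every point from the first-to-last chord.
    dx, dy = xn[-1] - xn[0], yn[-1] - yn[0]
    chord_length = math.hypot(dx, dy)
    distances = [
        abs(dy * (px - xn[0]) - dx * (py - yn[0])) / chord_length
        for px, py in zip(xn, yn)
    ]
    return max(range(len(distances)), key=distances.__getitem__)


# Synthetic illustration only: RMSE falling as samples per day increase.
samples_per_day = [1, 2, 4, 8, 12, 24, 48, 96]
rmse = [1.00, 0.70, 0.45, 0.30, 0.26, 0.23, 0.21, 0.20]

idx = knee_point(samples_per_day, rmse)
interval_hours = 24 / samples_per_day[idx]
print(f"knee at {samples_per_day[idx]} samples/day (every {interval_hours:.1f} h)")
```

Normalising both axes first keeps the choice of units (hours, samples per day, mg/L) from shifting the selected point; a dedicated knee-detection library may smooth the curve and behave differently on noisy data.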