Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil
Soil is one of the Earth’s most important natural resources. Metals can decrease environmental quality when present in excessive amounts. Analyzing soil metal contents can be costly and time consuming, but near-infrared (NIR) spectroscopy coupled with chemometric tools can offer an alt...
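Main authors: | Abrantes, Giovanna; Almeida, Valber; Maia, Angelo Jamil; Nascimento, Rennan; Nascimento, Clistenes; Silva, Ygor; Silva, Yuri; Veras, Germano |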
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10574190/ https://www.ncbi.nlm.nih.gov/pubmed/37836802 http://dx.doi.org/10.3390/molecules28196959 |
_version_ | 1785120636214968320 |
---|---|
author | Abrantes, Giovanna Almeida, Valber Maia, Angelo Jamil Nascimento, Rennan Nascimento, Clistenes Silva, Ygor Silva, Yuri Veras, Germano |
author_facet | Abrantes, Giovanna Almeida, Valber Maia, Angelo Jamil Nascimento, Rennan Nascimento, Clistenes Silva, Ygor Silva, Yuri Veras, Germano |
author_sort | Abrantes, Giovanna |
collection | PubMed |
description | Soil is one of the Earth’s most important natural resources. Metals can decrease environmental quality when present in excessive amounts. Analyzing soil metal contents can be costly and time consuming, but near-infrared (NIR) spectroscopy coupled with chemometric tools can offer an alternative. Partial least-squares (PLS) regression is the most important multivariate calibration method for predicting concentrations or physical, chemical or physicochemical properties. However, a large number of irrelevant variables may reduce the accuracy of predictive chemometric models. Thus, stochastic variable-selection techniques, such as the Firefly algorithm by intervals in PLS (FFiPLS), can provide better solutions for specific problems. This study aimed to evaluate the performance of FFiPLS against deterministic PLS algorithms for the prediction of metals in river basin soils. Spectra of the samples were collected in the 1000–2500 nm region. Predictive models were then built from the spectral data using PLS, interval-PLS (iPLS), the successive projections algorithm for interval selection in PLS (iSPA-PLS), and FFiPLS. The chemometric models were built with raw data and with data preprocessed by different methods such as multiplicative scatter correction (MSC), standard normal variate (SNV), mean centering, baseline adjustment and Savitzky–Golay smoothing. The elliptical joint confidence region (EJCR) computed for each chemometric model indicated adequate fit. FFiPLS models for iron and titanium obtained a residual prediction deviation (RPD) of more than 2. The chemometric models for the determination of aluminum obtained an RPD of more than 2 with raw data and with data preprocessed with SNV, MSC and baseline (offset + linear) correction. The models for Be, Gd and Y did not reach an adequate RPD, a result associated with the low metal contents in the samples. Considering the complexity of the samples, the relative error of prediction (REP) ranged from 10 to 25%, values adequate for this type of sample. The root mean square errors of calibration and prediction (RMSEC and RMSEP, respectively) presented the same profile as the other quality parameters. The FFiPLS algorithm outperformed deterministic algorithms in the construction of models estimating the content of Al, Be, Gd and Y. This study produced chemometric models with variable selection able to determine metals in the Ipojuca River watershed soils using reflectance-mode NIR spectrometry. |
format | Online Article Text |
id | pubmed-10574190 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10574190 2023-10-14 Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil Abrantes, Giovanna Almeida, Valber Maia, Angelo Jamil Nascimento, Rennan Nascimento, Clistenes Silva, Ygor Silva, Yuri Veras, Germano Molecules Article Soil is one of the Earth’s most important natural resources. Metals can decrease environmental quality when present in excessive amounts. Analyzing soil metal contents can be costly and time consuming, but near-infrared (NIR) spectroscopy coupled with chemometric tools can offer an alternative. Partial least-squares (PLS) regression is the most important multivariate calibration method for predicting concentrations or physical, chemical or physicochemical properties. However, a large number of irrelevant variables may reduce the accuracy of predictive chemometric models. Thus, stochastic variable-selection techniques, such as the Firefly algorithm by intervals in PLS (FFiPLS), can provide better solutions for specific problems. This study aimed to evaluate the performance of FFiPLS against deterministic PLS algorithms for the prediction of metals in river basin soils. Spectra of the samples were collected in the 1000–2500 nm region. Predictive models were then built from the spectral data using PLS, interval-PLS (iPLS), the successive projections algorithm for interval selection in PLS (iSPA-PLS), and FFiPLS. The chemometric models were built with raw data and with data preprocessed by different methods such as multiplicative scatter correction (MSC), standard normal variate (SNV), mean centering, baseline adjustment and Savitzky–Golay smoothing. The elliptical joint confidence region (EJCR) computed for each chemometric model indicated adequate fit. FFiPLS models for iron and titanium obtained a residual prediction deviation (RPD) of more than 2. The chemometric models for the determination of aluminum obtained an RPD of more than 2 with raw data and with data preprocessed with SNV, MSC and baseline (offset + linear) correction. The models for Be, Gd and Y did not reach an adequate RPD, a result associated with the low metal contents in the samples. Considering the complexity of the samples, the relative error of prediction (REP) ranged from 10 to 25%, values adequate for this type of sample. The root mean square errors of calibration and prediction (RMSEC and RMSEP, respectively) presented the same profile as the other quality parameters. The FFiPLS algorithm outperformed deterministic algorithms in the construction of models estimating the content of Al, Be, Gd and Y. This study produced chemometric models with variable selection able to determine metals in the Ipojuca River watershed soils using reflectance-mode NIR spectrometry. MDPI 2023-10-06 /pmc/articles/PMC10574190/ /pubmed/37836802 http://dx.doi.org/10.3390/molecules28196959 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Abrantes, Giovanna Almeida, Valber Maia, Angelo Jamil Nascimento, Rennan Nascimento, Clistenes Silva, Ygor Silva, Yuri Veras, Germano Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil |
title | Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil |
title_full | Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil |
title_fullStr | Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil |
title_full_unstemmed | Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil |
title_short | Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil |
title_sort | comparison between variable-selection algorithms in pls regression with near-infrared spectroscopy to predict selected metals in soil |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10574190/ https://www.ncbi.nlm.nih.gov/pubmed/37836802 http://dx.doi.org/10.3390/molecules28196959 |
work_keys_str_mv | AT abrantesgiovanna comparisonbetweenvariableselectionalgorithmsinplsregressionwithnearinfraredspectroscopytopredictselectedmetalsinsoil AT almeidavalber comparisonbetweenvariableselectionalgorithmsinplsregressionwithnearinfraredspectroscopytopredictselectedmetalsinsoil AT maiaangelojamil comparisonbetweenvariableselectionalgorithmsinplsregressionwithnearinfraredspectroscopytopredictselectedmetalsinsoil AT nascimentorennan comparisonbetweenvariableselectionalgorithmsinplsregressionwithnearinfraredspectroscopytopredictselectedmetalsinsoil AT nascimentoclistenes comparisonbetweenvariableselectionalgorithmsinplsregressionwithnearinfraredspectroscopytopredictselectedmetalsinsoil AT silvaygor comparisonbetweenvariableselectionalgorithmsinplsregressionwithnearinfraredspectroscopytopredictselectedmetalsinsoil AT silvayuri comparisonbetweenvariableselectionalgorithmsinplsregressionwithnearinfraredspectroscopytopredictselectedmetalsinsoil AT verasgermano comparisonbetweenvariableselectionalgorithmsinplsregressionwithnearinfraredspectroscopytopredictselectedmetalsinsoil |
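
The description above names SNV preprocessing and three figures of merit (RMSEP, REP and RPD) without giving their formulas. The sketch below shows one common way to compute them with NumPy. It is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, REP is taken as RMSEP divided by the mean reference value, and RPD as the sample standard deviation of the reference values divided by RMSEP. The article itself may define these quantities differently.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row)
    by its own mean and sample standard deviation."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def rmsep(y_ref, y_pred):
    """Root mean square error of prediction."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_ref - y_pred) ** 2)))

def rep_percent(y_ref, y_pred):
    """Relative error of prediction in percent.
    Assumed definition: 100 * RMSEP / mean(reference values)."""
    return 100.0 * rmsep(y_ref, y_pred) / float(np.mean(y_ref))

def rpd(y_ref, y_pred):
    """Residual prediction deviation.
    Assumed definition: sample SD of reference values / RMSEP."""
    return float(np.std(y_ref, ddof=1)) / rmsep(y_ref, y_pred)

# Made-up reference and predicted values, for illustration only.
y_ref = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 39.0]
print(rmsep(y_ref, y_pred), rep_percent(y_ref, y_pred), rpd(y_ref, y_pred))
```

Under these definitions, an RPD above 2 means the prediction error is less than half the spread of the reference values, which is the threshold the description uses to call a model adequate.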