Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil

Soil is one of the Earth’s most important natural resources. The presence of metals can decrease environmental quality if present in excessive amounts. Analyzing soil metal contents can be costly and time consuming, but near-infrared (NIR) spectroscopy coupled with chemometric tools can offer an alt...

Bibliographic Details
Main Authors: Abrantes, Giovanna; Almeida, Valber; Maia, Angelo Jamil; Nascimento, Rennan; Nascimento, Clistenes; Silva, Ygor; Silva, Yuri; Veras, Germano
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10574190/
https://www.ncbi.nlm.nih.gov/pubmed/37836802
http://dx.doi.org/10.3390/molecules28196959
author Abrantes, Giovanna
Almeida, Valber
Maia, Angelo Jamil
Nascimento, Rennan
Nascimento, Clistenes
Silva, Ygor
Silva, Yuri
Veras, Germano
author_sort Abrantes, Giovanna
collection PubMed
description Soil is one of the Earth’s most important natural resources. Metals can decrease environmental quality when present in excessive amounts. Analyzing soil metal contents can be costly and time consuming, but near-infrared (NIR) spectroscopy coupled with chemometric tools can offer an alternative. The most important multivariate calibration method in chemometrics for predicting concentrations or physical, chemical or physicochemical properties is partial least-squares (PLS) regression. However, a large number of irrelevant variables may reduce the accuracy of predictive chemometric models. Thus, stochastic variable-selection techniques, such as the Firefly algorithm by intervals in PLS (FFiPLS), can provide better solutions for specific problems. This study aimed to evaluate the performance of FFiPLS against deterministic PLS algorithms for the prediction of metals in river basin soils. Spectra of the samples were collected in the 1000–2500 nm region. Predictive models were then built from the spectral data, including PLS, interval-PLS (iPLS), successive projections algorithm for interval selection in PLS (iSPA-PLS), and FFiPLS. The chemometric models were built with raw data and with data preprocessed by different methods: multiplicative scatter correction (MSC), standard normal variate (SNV), mean centering, baseline adjustment and Savitzky–Golay smoothing. The elliptical joint confidence region (EJCR) computed for each chemometric model indicated an adequate fit. FFiPLS models of iron and titanium obtained a residual prediction deviation (RPD) of more than 2. The chemometric models for the determination of aluminum obtained an RPD of more than 2 with data preprocessed by SNV, MSC and baseline correction (offset + linear), as well as with raw data. The metals Be, Gd and Y failed to obtain adequate models in terms of RPD. These results are associated with the low concentrations of these metals in the samples. Considering the complexity of the samples, the relative error of prediction (REP) was between 10 and 25%, values that are adequate for this type of sample. Root mean square error of calibration and prediction (RMSEC and RMSEP, respectively) presented the same profile as the other quality parameters. The FFiPLS algorithm outperformed deterministic algorithms in the construction of models estimating the content of Al, Be, Gd and Y. This study produced chemometric models with variable selection able to determine metals in the Ipojuca River watershed soils using reflectance-mode NIR spectrometry.
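The description above quotes model-quality figures (RMSEP, RPD, REP) and lists SNV among the preprocessing methods without giving formulas. The following is a minimal sketch, assuming the definitions commonly used in chemometrics: SNV as row-wise centering and scaling of each spectrum, RPD as the standard deviation of the reference values divided by RMSEP, and REP as RMSEP expressed as a percentage of the mean reference value. The function names and the sample numbers are hypothetical and are not taken from the article, whose exact definitions may differ.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) separately."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def rmsep(y_ref, y_pred):
    """Root mean square error between reference and predicted values."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_ref - y_pred) ** 2)))


def rpd(y_ref, y_pred):
    """Assumed definition: sample SD of the reference values divided by RMSEP."""
    return float(np.std(np.asarray(y_ref, dtype=float), ddof=1) / rmsep(y_ref, y_pred))


def rep_percent(y_ref, y_pred):
    """Assumed definition: RMSEP as a percentage of the mean reference value."""
    return float(100.0 * rmsep(y_ref, y_pred) / np.mean(np.asarray(y_ref, dtype=float)))


# Made-up reference and predicted concentrations, for illustration only.
y_ref = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 32.0, 38.0]
print("RMSEP:", rmsep(y_ref, y_pred))
print("RPD:", rpd(y_ref, y_pred))
print("REP (%):", rep_percent(y_ref, y_pred))
```

Under these assumed definitions, the thresholds mentioned in the description correspond to `rpd(...) > 2` and `rep_percent(...)` falling between 10 and 25.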
format Online
Article
Text
id pubmed-10574190
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10574190 2023-10-14 Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil Abrantes, Giovanna Almeida, Valber Maia, Angelo Jamil Nascimento, Rennan Nascimento, Clistenes Silva, Ygor Silva, Yuri Veras, Germano Molecules Article MDPI 2023-10-06 /pmc/articles/PMC10574190/ /pubmed/37836802 http://dx.doi.org/10.3390/molecules28196959 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Comparison between Variable-Selection Algorithms in PLS Regression with Near-Infrared Spectroscopy to Predict Selected Metals in Soil
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10574190/
https://www.ncbi.nlm.nih.gov/pubmed/37836802
http://dx.doi.org/10.3390/molecules28196959