Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection

Laser-induced fluorescence spectroscopy, Raman scattering, and partial least squares regression models were optimized for the quantification of samarium (0–150 μg mL⁻¹), europium (0–75 μg mL⁻¹), and lithium chloride (0.1–12 M) with a transformational preprocessing strategy. Sel...

Bibliographic Details
Main authors: Andrews, Hunter B., Sadergaski, Luke R., Cary, Samantha K.
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9850777/
https://www.ncbi.nlm.nih.gov/pubmed/36687031
http://dx.doi.org/10.1021/acsomega.2c06610
collection PubMed
description Laser-induced fluorescence spectroscopy, Raman scattering, and partial least squares regression models were optimized for the quantification of samarium (0–150 μg mL⁻¹), europium (0–75 μg mL⁻¹), and lithium chloride (0.1–12 M) with a transformational preprocessing strategy. Selecting combinations of preprocessing methods to optimize the prediction performance of regression models is frequently a major bottleneck for chemometric analysis. Here, we propose an optimization tool that combines optimal experimental designs for selecting preprocessing transformations with a genetic algorithm (GA) for feature selection. Neither a D-optimal design containing 26 samples (i.e., combinations of preprocessing strategies) nor a user-defined design (576 samples) lowered the root mean square error of prediction (RMSEP) by a statistically significant amount. The greatest improvement in prediction performance was achieved when a GA was used for feature selection. This feature selection lowered RMSEP by an average of 53%, giving top models with percent RMSEP values of 0.91, 3.5, and 2.1% for Sm(III), Eu(III), and LiCl, respectively. These results indicate that preprocessing corrections (e.g., scatter, scaling, noise, and baseline) alone cannot realize the optimal regression model; feature selection is the more crucial aspect to consider. This approach provides a tool for approaching the true optimum prediction performance and can be applied to numerous fields of spectroscopy and chemometrics to rapidly construct models.
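The description names two ideas that a short sketch may make more concrete: RMSEP as the figure of merit, and a genetic algorithm that searches over subsets of spectral channels. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names (`rmsep`, `ga_select`), the GA settings (tournament selection, one-point crossover, bit-flip mutation, elitism), and the caller-supplied `cost` function are all hypothetical; the record does not say which GA variant or software was used. In practice `cost` would fit a regression model (for example PLS) on the selected channels and return its RMSEP on a validation set.

```python
import math
import random


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a validation set."""
    n = len(y_true)
    return math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)


def ga_select(n_features, cost, pop_size=30, generations=40,
              mutation_rate=0.05, seed=0):
    """Search for a 0/1 feature mask that minimizes cost(mask).

    Assumed (hypothetical) GA settings: size-2 tournament selection,
    one-point crossover, bit-flip mutation, and elitism. Requires
    n_features >= 2 and pop_size >= 2.
    """
    rng = random.Random(seed)
    # Random initial population of candidate feature masks.
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    best = min(pop, key=cost)

    def pick():
        # Tournament selection: the cheaper of two random individuals.
        a, b = rng.sample(pop, 2)
        return a if cost(a) <= cost(b) else b

    for _ in range(generations):
        children = [best[:]]  # elitism: the best mask always survives
        while len(children) < pop_size:
            p1, p2 = pick(), pick()
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            # Bit-flip mutation with a small per-gene probability.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        pop = children
        best = min(pop, key=cost)
    return best
```

Because the best mask is copied unchanged into each new generation, the cost of the returned mask can never exceed the best cost in the initial population. A real `cost` would also want to guard against empty masks and to cache model fits, since this sketch re-evaluates `cost` freely.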
id pubmed-9850777
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9850777 2023-01-20 Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection Andrews, Hunter B. Sadergaski, Luke R. Cary, Samantha K. ACS Omega American Chemical Society 2023-01-03 /pmc/articles/PMC9850777/ /pubmed/36687031 http://dx.doi.org/10.1021/acsomega.2c06610 Text en © 2023 The Authors. Published by American Chemical Society. License: https://creativecommons.org/licenses/by-nc-nd/4.0/ (permits non-commercial access and re-use, provided that author attribution and integrity are maintained, but does not permit creation of adaptations or other derivative works).