Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection
Laser-induced fluorescence spectroscopy, Raman scattering, and partial least squares regression models were optimized for the quantification of samarium (0–150 μg mL⁻¹), europium (0–75 μg mL⁻¹), and lithium chloride (0.1–12 M) with a transformational preprocessing strategy. Sel...
Main Authors: | Andrews, Hunter B.; Sadergaski, Luke R.; Cary, Samantha K. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2023 |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9850777/ https://www.ncbi.nlm.nih.gov/pubmed/36687031 http://dx.doi.org/10.1021/acsomega.2c06610 |
_version_ | 1784872259398139904 |
---|---|
author | Andrews, Hunter B. Sadergaski, Luke R. Cary, Samantha K. |
author_facet | Andrews, Hunter B. Sadergaski, Luke R. Cary, Samantha K. |
author_sort | Andrews, Hunter B. |
collection | PubMed |
description | Laser-induced fluorescence spectroscopy, Raman scattering, and partial least squares regression models were optimized for the quantification of samarium (0–150 μg mL⁻¹), europium (0–75 μg mL⁻¹), and lithium chloride (0.1–12 M) with a transformational preprocessing strategy. Selecting combinations of preprocessing methods to optimize the prediction performance of regression models is frequently a major bottleneck for chemometric analysis. Here, we propose an optimization tool using an innovative combination of optimal experimental designs for selecting preprocessing transformation and a genetic algorithm (GA) for feature selection. A D-optimal design containing 26 samples (i.e., combinations of preprocessing strategies) and a user-defined design (576 samples) did not statistically lower the root mean square error of the prediction (RMSEP). The greatest improvement in prediction performance was achieved when a GA was used for feature selection. This feature selection greatly lowered RMSEP statistics by an average of 53%, resulting in the top models with percent RMSEP values of 0.91, 3.5, and 2.1% for Sm(III), Eu(III), and LiCl, respectively. These results indicate that preprocessing corrections (e.g., scatter, scaling, noise, and baseline) alone cannot realize the optimal regression model; feature selection is a more crucial aspect to consider. This unique approach provides a powerful tool for approaching the true optimum prediction performance and can be applied to numerous fields of spectroscopy and chemometrics to rapidly construct models. |
format | Online Article Text |
id | pubmed-9850777 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9850777 2023-01-20 Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection Andrews, Hunter B. Sadergaski, Luke R. Cary, Samantha K. ACS Omega Laser-induced fluorescence spectroscopy, Raman scattering, and partial least squares regression models were optimized for the quantification of samarium (0–150 μg mL⁻¹), europium (0–75 μg mL⁻¹), and lithium chloride (0.1–12 M) with a transformational preprocessing strategy. Selecting combinations of preprocessing methods to optimize the prediction performance of regression models is frequently a major bottleneck for chemometric analysis. Here, we propose an optimization tool using an innovative combination of optimal experimental designs for selecting preprocessing transformation and a genetic algorithm (GA) for feature selection. A D-optimal design containing 26 samples (i.e., combinations of preprocessing strategies) and a user-defined design (576 samples) did not statistically lower the root mean square error of the prediction (RMSEP). The greatest improvement in prediction performance was achieved when a GA was used for feature selection. This feature selection greatly lowered RMSEP statistics by an average of 53%, resulting in the top models with percent RMSEP values of 0.91, 3.5, and 2.1% for Sm(III), Eu(III), and LiCl, respectively. These results indicate that preprocessing corrections (e.g., scatter, scaling, noise, and baseline) alone cannot realize the optimal regression model; feature selection is a more crucial aspect to consider. This unique approach provides a powerful tool for approaching the true optimum prediction performance and can be applied to numerous fields of spectroscopy and chemometrics to rapidly construct models. American Chemical Society 2023-01-03 /pmc/articles/PMC9850777/ /pubmed/36687031 http://dx.doi.org/10.1021/acsomega.2c06610 Text en © 2023 The Authors. Published by American Chemical Society. https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | Andrews, Hunter B. Sadergaski, Luke R. Cary, Samantha K. Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection |
title | Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection |
title_full | Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection |
title_fullStr | Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection |
title_full_unstemmed | Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection |
title_short | Pursuit of the Ultimate Regression Model for Samarium(III), Europium(III), and LiCl Using Laser-Induced Fluorescence, Design of Experiments, and a Genetic Algorithm for Feature Selection |
title_sort | pursuit of the ultimate regression model for samarium(iii), europium(iii), and licl using laser-induced fluorescence, design of experiments, and a genetic algorithm for feature selection |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9850777/ https://www.ncbi.nlm.nih.gov/pubmed/36687031 http://dx.doi.org/10.1021/acsomega.2c06610 |
work_keys_str_mv | AT andrewshunterb pursuitoftheultimateregressionmodelforsamariumiiieuropiumiiiandliclusinglaserinducedfluorescencedesignofexperimentsandageneticalgorithmforfeatureselection AT sadergaskiluker pursuitoftheultimateregressionmodelforsamariumiiieuropiumiiiandliclusinglaserinducedfluorescencedesignofexperimentsandageneticalgorithmforfeatureselection AT carysamanthak pursuitoftheultimateregressionmodelforsamariumiiieuropiumiiiandliclusinglaserinducedfluorescencedesignofexperimentsandageneticalgorithmforfeatureselection |
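
The description in this record credits a genetic algorithm (GA) for feature selection with the largest reduction in RMSEP. As a rough illustration of that idea only, the sketch below evolves boolean channel masks and scores each mask by the RMSEP of a regression on a held-out validation set. It is not the authors' implementation. Several things here are assumptions that do not come from the record: ordinary least squares with a tiny ridge term stands in for partial least squares so that the example needs only the Python standard library; the spectra are synthetic; the function names (`make_data`, `fit_predict`, `ga_select`) and all GA settings (population size, mutation rate, truncation selection, one-point crossover) are hypothetical choices.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_predict(X_train, y_train, X_new, mask, ridge=1e-6):
    """Least squares (assumed stand-in for PLS) on the masked channels plus an intercept."""
    cols = [j for j, keep in enumerate(mask) if keep]
    def design(X):
        return [[1.0] + [row[j] for j in cols] for row in X]
    Z = design(X_train)
    p = len(cols) + 1
    A = [[sum(z[a] * z[b] for z in Z) + (ridge if a == b else 0.0)
          for b in range(p)] for a in range(p)]
    rhs = [sum(z[a] * y for z, y in zip(Z, y_train)) for a in range(p)]
    beta = solve(A, rhs)
    return [sum(bk * zk for bk, zk in zip(beta, z)) for z in design(X_new)]

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def ga_select(X_tr, y_tr, X_val, y_val, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Evolve channel masks; fitness is validation RMSEP (lower is better)."""
    rng = random.Random(seed)
    n = len(X_tr[0])
    def score(mask):
        return rmsep(y_val, fit_predict(X_tr, y_tr, X_val, mask))
    pop = [[rng.random() < 0.5 for _ in range(n)] for _ in range(pop_size)]
    pop.append([True] * n)  # seed the population with the all-channels baseline
    for _ in range(generations):
        pop.sort(key=score)
        parents = pop[: pop_size // 2]  # truncation selection; the best mask always survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            children.append([(not g) if rng.random() < p_mut else g for g in child])
        pop = parents + children
    best = min(pop, key=score)
    return best, score(best)

def make_data(n_samples, rng, n_channels=8, informative=(2, 5)):
    """Synthetic 'spectra': only the informative channels respond to concentration."""
    X, y = [], []
    for _ in range(n_samples):
        conc = rng.uniform(0.0, 150.0)  # loosely mirrors the 0-150 range quoted for Sm
        row = [rng.gauss(0.0, 1.0) for _ in range(n_channels)]
        for j in informative:
            row[j] += 0.05 * conc
        X.append(row)
        y.append(conc)
    return X, y

if __name__ == "__main__":
    data_rng = random.Random(1)
    X_tr, y_tr = make_data(30, data_rng)
    X_val, y_val = make_data(15, data_rng)
    mask, err = ga_select(X_tr, y_tr, X_val, y_val)
    print("selected channels:", [j for j, keep in enumerate(mask) if keep])
    print("validation RMSEP:", round(err, 3))
```

Because the all-channels mask is seeded into the population and the best mask is carried forward each generation, the selected mask can never score worse on the validation set than using every channel. In practice the fitness would more likely come from cross-validation than from a single validation split, which this sketch omits for brevity.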