Machine Learning Analysis of Raman Spectra To Quantify the Organic Constituents in Complex Organic–Mineral Mixtures

Important decisions in local agricultural policy and practice often hinge on the soil’s chemical composition. Raman spectroscopy offers a rapid noninvasive means to quantify the constituents of complex organic systems. But the application of Raman spectroscopy to soils presents a m...

Bibliographic Details
Main authors: Zarei, Mahsa; Solomatova, Natalia V.; Aghaei, Hoda; Rothwell, Austin; Wiens, Jeffrey; Melo, Luke; Good, Travis G.; Shokatian, Sadegh; Grant, Edward
Format: Online Article Text
Language: English
Published: American Chemical Society 2023
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10620774/
https://www.ncbi.nlm.nih.gov/pubmed/37698955
http://dx.doi.org/10.1021/acs.analchem.3c02348
author Zarei, Mahsa
Solomatova, Natalia V.
Aghaei, Hoda
Rothwell, Austin
Wiens, Jeffrey
Melo, Luke
Good, Travis G.
Shokatian, Sadegh
Grant, Edward
collection PubMed
description Important decisions in local agricultural policy and practice often hinge on the soil’s chemical composition. Raman spectroscopy offers a rapid noninvasive means to quantify the constituents of complex organic systems. But the application of Raman spectroscopy to soils presents a multifaceted challenge due to organic/mineral compositional complexity and spectral interference arising from overwhelming fluorescence. The present work compares methodologies with the capacity to help overcome common obstacles that arise in the analysis of soils. We created conditions representative of these challenges by combining varying proportions of six amino acids commonly found in soils with fluorescent bentonite clay and coarse mineral components. Referring to an extensive data set of Raman spectra, we compare the performance of the convolutional neural network (CNN) and partial least-squares regression (PLSR) multivariate models for amino acid composition. Strategies employing volume-averaged spectral sampling and data preprocessing algorithms improve the predictive power of these models. Our average test R² for PLSR models exceeds 0.89 and approaches 0.98, depending on the complexity of the matrix, whereas CNN yields an R² range from 0.91 to 0.97, demonstrating that classic PLSR and CNN perform comparably, except in cases where the signal-to-noise ratio of the organic component is very low, whereupon CNN models outperform PLSR. Artificially isolating two of the most prevalent obstacles in evaluating the Raman spectra of soils, we have characterized the effect of each obstacle on the performance of machine learning models in the absence of other complexities. These results highlight important considerations and modeling strategies necessary to improve the Raman analysis of organic compounds in complex mixtures in the presence of mineral spectral components and significant fluorescence.
format Online
Article
Text
id pubmed-10620774
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-10620774 2023-11-03 Anal Chem American Chemical Society 2023-09-12 Text en © 2023 The Authors. Published by American Chemical Society. License: https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works.
title Machine Learning Analysis of Raman Spectra To Quantify the Organic Constituents in Complex Organic–Mineral Mixtures
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10620774/
https://www.ncbi.nlm.nih.gov/pubmed/37698955
http://dx.doi.org/10.1021/acs.analchem.3c02348