Array-Based Machine Learning for Functional Group Detection in Electron Ionization Mass Spectrometry
Mass spectrometry is a ubiquitous technique capable of complex chemical analysis. The fragmentation patterns that appear in mass spectrometry are an excellent target for artificial intelligence methods to automate and expedite the analysis of data to identify targets such as functi...
Main authors: | North, Nicole M., Enders, Abigail A., Cable, Morgan L., Allen, Heather C. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society, 2023 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10339417/ https://www.ncbi.nlm.nih.gov/pubmed/37457446 http://dx.doi.org/10.1021/acsomega.3c01684 |
_version_ | 1785071841009729536 |
---|---|
author | North, Nicole M. Enders, Abigail A. Cable, Morgan L. Allen, Heather C. |
author_facet | North, Nicole M. Enders, Abigail A. Cable, Morgan L. Allen, Heather C. |
author_sort | North, Nicole M. |
collection | PubMed |
description | Mass spectrometry is a ubiquitous technique capable of complex chemical analysis. The fragmentation patterns that appear in mass spectrometry are an excellent target for artificial intelligence methods to automate and expedite the analysis of data to identify targets such as functional groups. To develop this approach, we trained models on electron ionization (a reproducible hard fragmentation technique) mass spectra so that not only the final model accuracies but also the reasoning behind model assignments could be evaluated. The convolutional neural network (CNN) models were trained on 2D images of the spectra using transfer learning of Inception V3, and the logistic regression models were trained using array-based data and Scikit Learn implementation in Python. Our training dataset consisted of 21,166 mass spectra from the United States’ National Institute of Standards and Technology (NIST) Webbook. The data was used to train models to identify functional groups, both specific (e.g., amines, esters) and generalized classifications (aromatics, oxygen-containing functional groups, and nitrogen-containing functional groups). We found that the highest final accuracies on identifying new data were observed using logistic regression rather than transfer learning on CNN models. It was also determined that the mass range most beneficial for functional group analysis is 0–100 m/z. We also found success in correctly identifying functional groups of example molecules selected from both the NIST database and experimental data. Beyond functional group analysis, we also have developed a methodology to identify impactful fragments for the accurate detection of the models’ targets. The results demonstrate a potential pathway for analyzing and screening substantial amounts of mass spectral data. |
format | Online Article Text |
id | pubmed-10339417 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-10339417 2023-07-14 Array-Based Machine Learning for Functional Group Detection in Electron Ionization Mass Spectrometry North, Nicole M. Enders, Abigail A. Cable, Morgan L. Allen, Heather C. ACS Omega Mass spectrometry is a ubiquitous technique capable of complex chemical analysis. The fragmentation patterns that appear in mass spectrometry are an excellent target for artificial intelligence methods to automate and expedite the analysis of data to identify targets such as functional groups. To develop this approach, we trained models on electron ionization (a reproducible hard fragmentation technique) mass spectra so that not only the final model accuracies but also the reasoning behind model assignments could be evaluated. The convolutional neural network (CNN) models were trained on 2D images of the spectra using transfer learning of Inception V3, and the logistic regression models were trained using array-based data and Scikit Learn implementation in Python. Our training dataset consisted of 21,166 mass spectra from the United States’ National Institute of Standards and Technology (NIST) Webbook. The data was used to train models to identify functional groups, both specific (e.g., amines, esters) and generalized classifications (aromatics, oxygen-containing functional groups, and nitrogen-containing functional groups). We found that the highest final accuracies on identifying new data were observed using logistic regression rather than transfer learning on CNN models. It was also determined that the mass range most beneficial for functional group analysis is 0–100 m/z. We also found success in correctly identifying functional groups of example molecules selected from both the NIST database and experimental data. Beyond functional group analysis, we also have developed a methodology to identify impactful fragments for the accurate detection of the models’ targets. The results demonstrate a potential pathway for analyzing and screening substantial amounts of mass spectral data. American Chemical Society 2023-06-29 /pmc/articles/PMC10339417/ /pubmed/37457446 http://dx.doi.org/10.1021/acsomega.3c01684 Text en © 2023 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by-nc-nd/4.0/ Permits non-commercial access and re-use, provided that author attribution and integrity are maintained; but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/). |
spellingShingle | North, Nicole M. Enders, Abigail A. Cable, Morgan L. Allen, Heather C. Array-Based Machine Learning for Functional Group Detection in Electron Ionization Mass Spectrometry |
title | Array-Based Machine Learning for Functional Group Detection in Electron Ionization Mass Spectrometry |
title_full | Array-Based Machine Learning for Functional Group Detection in Electron Ionization Mass Spectrometry |
title_fullStr | Array-Based Machine Learning for Functional Group Detection in Electron Ionization Mass Spectrometry |
title_full_unstemmed | Array-Based Machine Learning for Functional Group Detection in Electron Ionization Mass Spectrometry |
title_short | Array-Based Machine Learning for Functional Group Detection in Electron Ionization Mass Spectrometry |
title_sort | array-based machine learning for functional group detection in electron ionization mass spectrometry |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10339417/ https://www.ncbi.nlm.nih.gov/pubmed/37457446 http://dx.doi.org/10.1021/acsomega.3c01684 |
work_keys_str_mv | AT northnicolem arraybasedmachinelearningforfunctionalgroupdetectioninelectronionizationmassspectrometry AT endersabigaila arraybasedmachinelearningforfunctionalgroupdetectioninelectronionizationmassspectrometry AT cablemorganl arraybasedmachinelearningforfunctionalgroupdetectioninelectronionizationmassspectrometry AT allenheatherc arraybasedmachinelearningforfunctionalgroupdetectioninelectronionizationmassspectrometry |
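
The abstract in this record describes logistic regression on array-based spectra, a 0–100 m/z window, and a way to find impactful fragments, but it gives no implementation details. The following is only a minimal sketch of that general idea, under several assumptions: the function names (`spectrum_to_array`, `fit_logistic`, `predict`) are hypothetical, the integer m/z binning and base-peak normalization are guesses at what "array-based" could mean, the four toy spectra are invented (they are not NIST data), and a plain-Python gradient descent stands in for the scikit-learn logistic regression named in the abstract. Ranking fragments by learned weight is one possible reading of "impactful fragments", not necessarily the authors' methodology.

```python
import math

MAX_MZ = 100  # assumed upper bound, matching the 0-100 m/z window in the abstract


def spectrum_to_array(peaks, max_mz=MAX_MZ):
    """Turn (m/z, intensity) pairs into a fixed-length vector, one bin per integer m/z."""
    vec = [0.0] * (max_mz + 1)
    for mz, intensity in peaks:
        idx = int(round(mz))
        if 0 <= idx <= max_mz:  # peaks outside the window are dropped
            vec[idx] = max(vec[idx], float(intensity))
    top = max(vec)
    # Scale so the base peak is 1.0; an all-zero spectrum is returned unchanged.
    return [v / top for v in vec] if top > 0 else vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(w, b, x):
    """Probability that the spectrum vector x contains the target functional group."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


def fit_logistic(X, y, lr=0.5, epochs=500):
    """Batch gradient descent on the logistic loss (stand-in for a library solver)."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


# Invented toy spectra: label 1 = "amine-like" (peak at m/z 30), label 0 = other.
toy_spectra = [
    ([(30, 100), (44, 20), (58, 5)], 1),
    ([(30, 80), (41, 30), (72, 10)], 1),
    ([(43, 100), (57, 60), (71, 20)], 0),
    ([(29, 40), (43, 100), (85, 15)], 0),
]
X = [spectrum_to_array(peaks) for peaks, _ in toy_spectra]
y = [label for _, label in toy_spectra]

w, b = fit_logistic(X, y)

# The m/z bin with the largest positive weight pushes hardest toward the target class.
top_fragment = max(range(len(w)), key=lambda j: w[j])
print("most impactful m/z bin:", top_fragment)
```

Because a logistic regression model assigns one weight per m/z bin, its weights can be read directly as evidence for or against the target group, which is one reason such a model is easier to interrogate than an image-based CNN.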