Collision Cross Section Prediction with Molecular Fingerprint Using Machine Learning

Bibliographic Details
Main Authors: Yang, Fan; van Herwerden, Denice; Preud’homme, Hugues; Samanipour, Saer
Format: Online Article Text
Language: English
Published: MDPI 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9572128/
https://www.ncbi.nlm.nih.gov/pubmed/36234961
http://dx.doi.org/10.3390/molecules27196424
collection PubMed
description High-resolution mass spectrometry (HRMS) is a promising technique in non-target screening (NTS) for monitoring contaminants of emerging concern in complex samples. Current chemical identification strategies in NTS experiments typically depend on spectral libraries, chemical databases, and in silico fragmentation tools. However, small molecule identification remains challenging due to the lack of orthogonal sources of information (e.g., unique fragments). Collision cross section (CCS) values measured by ion mobility spectrometry (IMS) offer an additional identification dimension that increases the confidence level. Thanks to advances in analytical instrumentation, IMS coupled with HRMS has been applied increasingly in NTS in recent decades. Several CCS prediction tools have been developed. However, few CCS prediction methods have been based on a wide range of chemical classes and cross-platform CCS measurements. We developed two prediction models using a random forest machine learning algorithm. One approach was based on chemicals’ super classes; the other predicted CCS directly from molecular fingerprints. Over 13,324 CCS values from six different laboratories and PubChem, measured with a variety of ion-mobility separation techniques, were used for training and testing the models. The test accuracy for all the prediction models was over 0.85, and the median relative residual was around 2.2%. The models can be applied to different IMS platforms to eliminate false positives in small molecule identification.
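The two quantities the description relies on, the relative residual between predicted and measured CCS and the use of predicted CCS to discard false positives, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the CCS values, the 5% tolerance, and the choice to express the residual relative to the measured value are all assumptions made for illustration.

```python
from statistics import median


def relative_residuals(measured, predicted):
    """Return the absolute relative residual (in %) for each CCS pair.

    Assumption: the residual is taken relative to the measured value.
    """
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    return [abs(p - m) / m * 100.0 for m, p in zip(measured, predicted)]


def filter_candidates(measured_ccs, candidates, tolerance_pct):
    """Keep candidate names whose predicted CCS is within tolerance_pct
    of the measured CCS; the rest are treated as likely false positives."""
    kept = []
    for name, predicted_ccs in candidates:
        deviation = abs(predicted_ccs - measured_ccs) / measured_ccs * 100.0
        if deviation <= tolerance_pct:
            kept.append(name)
    return kept


# Hypothetical measured and predicted CCS values (square angstroms).
measured = [150.0, 200.0, 180.0, 160.0]
predicted = [153.0, 196.0, 180.9, 168.0]

residuals = relative_residuals(measured, predicted)
median_residual = median(residuals)
print(f"median relative residual: {median_residual:.2f}%")

# Hypothetical candidate structures for a feature measured at 150.0,
# each paired with its predicted CCS. The 5% tolerance is an assumption.
candidates = [
    ("candidate_A", 151.5),
    ("candidate_B", 165.0),
    ("candidate_C", 147.0),
]
kept = filter_candidates(150.0, candidates, tolerance_pct=5.0)
print("candidates kept:", kept)
```

In practice the tolerance would be chosen from the model's observed error distribution; a window a few times wider than the median relative residual is one plausible starting point, not a value taken from the article.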
id pubmed-9572128
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9572128 2022-10-17 Molecules Article MDPI 2022-09-29 /pmc/articles/PMC9572128/ /pubmed/36234961 http://dx.doi.org/10.3390/molecules27196424 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).