
MS2Query: reliable and scalable MS(2) mass spectra-based analogue search

Metabolomics-driven discoveries of biological samples remain hampered by the grand challenge of metabolite annotation and identification. Only a few metabolites have an annotated spectrum in spectral libraries; hence, searching only for exact library matches generally returns few hits. An attractive...


Bibliographic Details
Main authors: de Jonge, Niek F., Louwen, Joris J. R., Chekmeneva, Elena, Camuzeaux, Stephane, Vermeir, Femke J., Jansen, Robert S., Huber, Florian, van der Hooft, Justin J. J.
Format: Online Article Text
Language: English
Published: Nature Publishing Group UK 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10060387/
https://www.ncbi.nlm.nih.gov/pubmed/36990978
http://dx.doi.org/10.1038/s41467-023-37446-4
author de Jonge, Niek F.
Louwen, Joris J. R.
Chekmeneva, Elena
Camuzeaux, Stephane
Vermeir, Femke J.
Jansen, Robert S.
Huber, Florian
van der Hooft, Justin J. J.
author_sort de Jonge, Niek F.
collection PubMed
description Metabolomics-driven discoveries of biological samples remain hampered by the grand challenge of metabolite annotation and identification. Only a few metabolites have an annotated spectrum in spectral libraries; hence, searching only for exact library matches generally returns few hits. An attractive alternative is searching for so-called analogues as a starting point for structural annotations; analogues are library molecules that are not exact matches but display a high chemical similarity. However, current analogue search implementations are not yet very reliable and are relatively slow. Here, we present MS2Query, a machine learning-based tool that integrates mass spectral embedding-based chemical similarity predictors (Spec2Vec and MS2Deepscore) as well as detected precursor masses to rank potential analogues and exact matches. Benchmarking MS2Query on reference mass spectra and experimental case studies demonstrates improved reliability and scalability. Thereby, MS2Query offers exciting opportunities to further increase the annotation rate of metabolomics profiles of complex metabolite mixtures and to discover new biology.
format Online
Article
Text
id pubmed-10060387
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Nature Publishing Group UK
record_format MEDLINE/PubMed
spelling pubmed-10060387 2023-03-31 MS2Query: reliable and scalable MS(2) mass spectra-based analogue search de Jonge, Niek F. Louwen, Joris J. R. Chekmeneva, Elena Camuzeaux, Stephane Vermeir, Femke J. Jansen, Robert S. Huber, Florian van der Hooft, Justin J. J. Nat Commun Article Nature Publishing Group UK 2023-03-29 /pmc/articles/PMC10060387/ /pubmed/36990978 http://dx.doi.org/10.1038/s41467-023-37446-4 Text en © The Author(s) 2023
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/.
title MS2Query: reliable and scalable MS(2) mass spectra-based analogue search
topic Article