MSNovelist: de novo structure generation from mass spectra

Bibliographic Details
Main authors: Stravs, Michael A., Dührkop, Kai, Böcker, Sebastian, Zamboni, Nicola
Format: Online Article Text
Language: English
Published: Nature Publishing Group US 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9262714/
https://www.ncbi.nlm.nih.gov/pubmed/35637304
http://dx.doi.org/10.1038/s41592-022-01486-3
Description
Summary: Current methods for structure elucidation of small molecules rely on finding similarity with spectra of known compounds, but do not predict structures de novo for unknown compound classes. We present MSNovelist, which combines fingerprint prediction with an encoder–decoder neural network to generate structures de novo solely from tandem mass spectrometry (MS²) spectra. In an evaluation with 3,863 MS² spectra from the Global Natural Product Social Molecular Networking site, MSNovelist predicted 25% of structures correctly on first rank, retrieved 45% of structures overall and reproduced 61% of correct database annotations, without having ever seen the structure in the training phase. Similarly, for the CASMI 2016 challenge, MSNovelist correctly predicted 26% and retrieved 57% of structures, recovering 64% of correct database annotations. Finally, we illustrate the application of MSNovelist in a bryophyte MS² dataset, in which de novo structure prediction substantially outscored the best database candidate for seven spectra. MSNovelist is ideally suited to complement library-based annotation in the case of poorly represented analyte classes and novel compounds.