Prosit Transformer: A transformer for Prediction of MS2 Spectrum Intensities

Bibliographic Details
Main authors: Ekvall, Markus; Truong, Patrick; Gabriel, Wassim; Wilhelm, Mathias; Käll, Lukas
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9087333/
https://www.ncbi.nlm.nih.gov/pubmed/35413196
http://dx.doi.org/10.1021/acs.jproteome.1c00870
author Ekvall, Markus
Truong, Patrick
Gabriel, Wassim
Wilhelm, Mathias
Käll, Lukas
collection PubMed
description Machine learning has long been an integral part of interpreting data from mass spectrometry (MS)-based proteomics. Relatively recently, a machine-learning architecture, the Transformer, has proven successful in other areas of bioinformatics. Furthermore, implementing Transformers within bioinformatics has become relatively convenient thanks to transfer learning, i.e., adapting a network trained for other tasks to new functionality. Transfer learning makes these relatively large networks more accessible because it generally requires less data and substantially shortens training time. We implemented a Transformer based on the pretrained model TAPE to predict MS2 intensities. TAPE is a general model trained to predict missing residues from protein sequences. Although TAPE was trained for a different task, we could modify its behavior by adding a prediction head at the end of the model and fine-tuning it on the spectrum intensities from the training set of the well-known predictor Prosit. We demonstrate that the predictor, which we call Prosit Transformer, outperforms the recurrent neural-network-based predictor Prosit, increasing the median angular similarity on its hold-out set from 0.908 to 0.929. We believe that Transformers will significantly increase prediction accuracy for other types of predictions within MS-based proteomics.
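The description reports results as angular similarity but does not define the metric. As a minimal sketch, assuming the normalized spectral angle commonly used for comparing predicted and observed fragment intensities (1 minus 2/π times the angle between the two intensity vectors), it could be computed as below. The function name and the example vectors are hypothetical and are not taken from the paper.

```python
import math


def angular_similarity(predicted, observed):
    """Normalized spectral angle between two intensity vectors.

    Assumed definition: 1 - 2 * arccos(cosine similarity) / pi.
    For non-negative intensities the result lies in [0, 1].
    """
    if len(predicted) != len(observed):
        raise ValueError("intensity vectors must have the same length")

    dot = sum(p * o for p, o in zip(predicted, observed))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_o = math.sqrt(sum(o * o for o in observed))

    # An all-zero vector has no direction; treat it as no similarity.
    if norm_p == 0.0 or norm_o == 0.0:
        return 0.0

    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm_p * norm_o)))
    return 1.0 - 2.0 * math.acos(cosine) / math.pi


if __name__ == "__main__":
    # Hypothetical fragment intensities for one peptide spectrum.
    predicted = [0.9, 0.1, 0.4, 0.0]
    observed = [1.0, 0.2, 0.3, 0.0]
    print(round(angular_similarity(predicted, observed), 3))
```

Under this definition the score depends only on the relative pattern of intensities, so scaling either spectrum by a constant leaves it unchanged.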
format Online
Article
Text
id pubmed-9087333
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher American Chemical Society
record_format MEDLINE/PubMed
spelling pubmed-9087333 2022-05-11 J Proteome Res American Chemical Society 2022-04-12 2022-05-06 /pmc/articles/PMC9087333/ /pubmed/35413196 http://dx.doi.org/10.1021/acs.jproteome.1c00870 Text en © 2022 The Authors. Published by American Chemical Society.
License: https://creativecommons.org/licenses/by/4.0/ (permits the broadest form of re-use, including for commercial purposes, provided that author attribution and integrity are maintained).
title Prosit Transformer: A transformer for Prediction of MS2 Spectrum Intensities