Evaluation of open search methods based on theoretical mass spectra comparison

BACKGROUND: Mass spectrometry remains the privileged method to characterize proteins. Nevertheless, most of the spectra generated by an experiment remain unidentified after their analysis, mostly because of the modifications they carry. Open Modification Search (OMS) methods offer a promising answer...

Bibliographic Details
Main Authors: Lysiak, Albane; Fertin, Guillaume; Jean, Géraldine; Tessier, Dominique
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8073971/
https://www.ncbi.nlm.nih.gov/pubmed/33902435
http://dx.doi.org/10.1186/s12859-021-03963-6
author Lysiak, Albane
Fertin, Guillaume
Jean, Géraldine
Tessier, Dominique
collection PubMed
description BACKGROUND: Mass spectrometry remains the privileged method to characterize proteins. Nevertheless, most of the spectra generated by an experiment remain unidentified after their analysis, mostly because of the modifications they carry. Open Modification Search (OMS) methods offer a promising answer to this problem. However, assessing the quality of OMS identifications remains a difficult task. METHODS: Aiming at better understanding the relationship between (1) similarity of pairs of spectra provided by OMS methods and (2) relevance of their corresponding peptide sequences, we used a dataset composed of theoretical spectra only, on which we applied two OMS strategies. We also introduced two appropriately defined measures for evaluating the above-mentioned spectra/sequence relevance in this context: one is a color classification representing the level of difficulty to retrieve the proper sequence of the peptide that generated the identified spectrum; the other, called LIPR, is the proportion of common masses, in a given Peptide Spectrum Match (PSM), that represent dissimilar sequences. These two measures were also considered in conjunction with the False Discovery Rate (FDR). RESULTS: According to our measures, the strategy that selects the best candidate by taking the mass difference between two spectra into account yields better quality results. Besides, although the FDR remains an interesting indicator in OMS methods (as shown by LIPR), it is questionable: indeed, our color classification shows that a non-negligible proportion of relevant spectra/sequence interpretations corresponds to PSMs coming from the decoy database. CONCLUSIONS: The three above-mentioned measures allowed us to clearly determine which of the two studied OMS strategies outperformed the other, both in terms of number of identifications and of accuracy of these identifications. Even though quality evaluation of PSMs in OMS methods remains challenging, the study of theoretical spectra is a favorable framework for going further in this direction.
format Online
Article
Text
id pubmed-8073971
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-8073971 2021-04-26 Evaluation of open search methods based on theoretical mass spectra comparison Lysiak, Albane Fertin, Guillaume Jean, Géraldine Tessier, Dominique BMC Bioinformatics Research BioMed Central 2021-04-26 /pmc/articles/PMC8073971/ /pubmed/33902435 http://dx.doi.org/10.1186/s12859-021-03963-6 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Evaluation of open search methods based on theoretical mass spectra comparison
topic Research