A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics

Bibliographic Details
Main Authors: Luo, Xiyang, Bittremieux, Wout, Griss, Johannes, Deutsch, Eric W., Sachsenberg, Timo, Levitsky, Lev I., Ivanov, Mark V., Bubis, Julia A., Gabriels, Ralf, Webel, Henry, Sanchez, Aniel, Bai, Mingze, Käll, Lukas, Perez-Riverol, Yasset
Format: Online Article Text
Language: English
Published: American Chemical Society 2022
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9171829/
https://www.ncbi.nlm.nih.gov/pubmed/35549218
http://dx.doi.org/10.1021/acs.jproteome.2c00069
collection PubMed
description Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found to be the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTMs) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark.
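As a rough illustration of two of the strategies named in the description (spectrum binning and the most similar spectrum), the following Python sketch may help. It is not the authors' implementation, which lives in the GitHub repository cited above. The peak representation as `(m/z, intensity)` tuples, the function names, and the parameters `bin_width` and `min_fraction` are all assumptions made for this example.

```python
import math
from collections import defaultdict


def to_bins(spectrum, bin_width):
    """Sum peak intensities into fixed-width m/z bins (hypothetical helper)."""
    bins = defaultdict(float)
    for mz, intensity in spectrum:
        bins[int(mz // bin_width)] += intensity
    return bins


def cosine(a, b):
    """Cosine similarity between two binned spectra stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bin_consensus(cluster, bin_width=0.02, min_fraction=0.5):
    """Binning-style consensus: keep bins seen in enough cluster members.

    Assumed rule: a bin is kept when at least `min_fraction` of the spectra
    have a peak in it. The reported m/z is the intensity-weighted mean and
    the reported intensity is the summed intensity divided by cluster size.
    """
    total = defaultdict(float)        # summed intensity per bin
    weighted_mz = defaultdict(float)  # sum of m/z * intensity per bin
    support = defaultdict(int)        # number of spectra with a peak in the bin
    for spectrum in cluster:
        seen = set()
        for mz, intensity in spectrum:
            b = int(mz // bin_width)
            total[b] += intensity
            weighted_mz[b] += mz * intensity
            seen.add(b)
        for b in seen:
            support[b] += 1
    n = len(cluster)
    consensus = []
    for b in sorted(total):
        if total[b] > 0 and support[b] / n >= min_fraction:
            consensus.append((weighted_mz[b] / total[b], total[b] / n))
    return consensus


def most_similar(cluster, bin_width=0.02):
    """Return the member with the highest mean cosine similarity to the rest."""
    binned = [to_bins(s, bin_width) for s in cluster]

    def mean_similarity(i):
        others = [cosine(binned[i], binned[j])
                  for j in range(len(binned)) if j != i]
        return sum(others) / len(others) if others else 0.0

    return cluster[max(range(len(cluster)), key=mean_similarity)]
```

Both functions take one cluster as a list of spectra. The binning variant builds a new synthetic spectrum, whereas the most-similar variant returns one of the measured spectra unchanged, which is the main design difference between the two families of methods.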
id pubmed-9171829
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-9171829 2022-06-08 J Proteome Res American Chemical Society 2022-05-13 2022-06-03 Text en © 2022 The Authors. Published by American Chemical Society. https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use, including for commercial purposes, provided that author attribution and integrity are maintained.