A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics
Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-call...
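Main authors: | Luo, Xiyang; Bittremieux, Wout; Griss, Johannes; Deutsch, Eric W.; Sachsenberg, Timo; Levitsky, Lev I.; Ivanov, Mark V.; Bubis, Julia A.; Gabriels, Ralf; Webel, Henry; Sanchez, Aniel; Bai, Mingze; Käll, Lukas; Perez-Riverol, Yasset |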
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Chemical Society 2022 |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9171829/ https://www.ncbi.nlm.nih.gov/pubmed/35549218 http://dx.doi.org/10.1021/acs.jproteome.2c00069 |
_version_ | 1784721754787151872 |
---|---|
author | Luo, Xiyang Bittremieux, Wout Griss, Johannes Deutsch, Eric W. Sachsenberg, Timo Levitsky, Lev I. Ivanov, Mark V. Bubis, Julia A. Gabriels, Ralf Webel, Henry Sanchez, Aniel Bai, Mingze Käll, Lukas Perez-Riverol, Yasset |
author_facet | Luo, Xiyang Bittremieux, Wout Griss, Johannes Deutsch, Eric W. Sachsenberg, Timo Levitsky, Lev I. Ivanov, Mark V. Bubis, Julia A. Gabriels, Ralf Webel, Henry Sanchez, Aniel Bai, Mingze Käll, Lukas Perez-Riverol, Yasset |
author_sort | Luo, Xiyang |
collection | PubMed |
description | Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found to be the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTM) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark. |
format | Online Article Text |
id | pubmed-9171829 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | American Chemical Society |
record_format | MEDLINE/PubMed |
spelling | pubmed-9171829 2022-06-08 A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics Luo, Xiyang Bittremieux, Wout Griss, Johannes Deutsch, Eric W. Sachsenberg, Timo Levitsky, Lev I. Ivanov, Mark V. Bubis, Julia A. Gabriels, Ralf Webel, Henry Sanchez, Aniel Bai, Mingze Käll, Lukas Perez-Riverol, Yasset J Proteome Res Spectrum clustering is a powerful strategy to minimize redundant mass spectra by grouping them based on similarity, with the aim of forming groups of mass spectra from the same repeatedly measured analytes. Each such group of near-identical spectra can be represented by its so-called consensus spectrum for downstream processing. Although several algorithms for spectrum clustering have been adequately benchmarked and tested, the influence of the consensus spectrum generation step is rarely evaluated. Here, we present an implementation and benchmark of common consensus spectrum algorithms, including spectrum averaging, spectrum binning, the most similar spectrum, and the best-identified spectrum. We have analyzed diverse public data sets using two different clustering algorithms (spectra-cluster and MaRaCluster) to evaluate how the consensus spectrum generation procedure influences downstream peptide identification. The BEST and BIN methods were found to be the most reliable methods for consensus spectrum generation, including for data sets with post-translational modifications (PTM) such as phosphorylation. All source code and data of the present study are freely available on GitHub at https://github.com/statisticalbiotechnology/representative-spectra-benchmark. American Chemical Society 2022-05-13 2022-06-03 /pmc/articles/PMC9171829/ /pubmed/35549218 http://dx.doi.org/10.1021/acs.jproteome.2c00069 Text en © 2022 The Authors. Published by American Chemical Society https://creativecommons.org/licenses/by/4.0/ Permits the broadest form of re-use, including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Luo, Xiyang Bittremieux, Wout Griss, Johannes Deutsch, Eric W. Sachsenberg, Timo Levitsky, Lev I. Ivanov, Mark V. Bubis, Julia A. Gabriels, Ralf Webel, Henry Sanchez, Aniel Bai, Mingze Käll, Lukas Perez-Riverol, Yasset A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics |
title | A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics |
title_full | A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics |
title_fullStr | A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics |
title_full_unstemmed | A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics |
title_short | A Comprehensive Evaluation of Consensus Spectrum Generation Methods in Proteomics |
title_sort | comprehensive evaluation of consensus spectrum generation methods in proteomics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9171829/ https://www.ncbi.nlm.nih.gov/pubmed/35549218 http://dx.doi.org/10.1021/acs.jproteome.2c00069 |
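
The description field in this record names several consensus strategies (spectrum binning, the most similar spectrum, the best-identified spectrum). As a rough illustration only, the sketch below shows one way such strategies could look in Python. It is not the authors' implementation from the GitHub repository cited in the record: the function names, the `(m/z, intensity)` tuple layout, the bin width, the `min_fraction` threshold, and the use of cosine similarity are all assumptions made for this example.

```python
from collections import defaultdict
from math import sqrt

# Assumption: a spectrum is a list of (m/z, intensity) tuples and a cluster
# is a list of such spectra. All names and defaults here are hypothetical.


def binned(spectrum, bin_width=0.02):
    """Return {bin index: summed intensity} for a single spectrum."""
    bins = defaultdict(float)
    for mz, intensity in spectrum:
        bins[int(mz // bin_width)] += intensity
    return bins


def bin_consensus(spectra, bin_width=0.02, min_fraction=0.5):
    """Binning-style consensus: keep bins seen in enough cluster members."""
    mz_sum = defaultdict(float)
    intensity_sum = defaultdict(float)
    peak_count = defaultdict(int)
    seen_in = defaultdict(int)  # number of spectra with a peak in each bin
    for spectrum in spectra:
        touched = set()
        for mz, intensity in spectrum:
            b = int(mz // bin_width)
            mz_sum[b] += mz
            intensity_sum[b] += intensity
            peak_count[b] += 1
            touched.add(b)
        for b in touched:
            seen_in[b] += 1
    n = len(spectra)
    # m/z is averaged over contributing peaks, intensity over all spectra.
    return [
        (mz_sum[b] / peak_count[b], intensity_sum[b] / n)
        for b in sorted(seen_in)
        if seen_in[b] / n >= min_fraction
    ]


def cosine(a, b, bin_width=0.02):
    """Cosine similarity between two spectra after binning."""
    va, vb = binned(a, bin_width), binned(b, bin_width)
    dot = sum(v * vb.get(k, 0.0) for k, v in va.items())
    norm = sqrt(sum(v * v for v in va.values())) * sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def most_similar(spectra, bin_width=0.02):
    """Pick the member with the highest summed similarity to the others."""
    def total(i):
        return sum(
            cosine(spectra[i], s, bin_width)
            for j, s in enumerate(spectra)
            if j != i
        )
    return spectra[max(range(len(spectra)), key=total)]


def best_identified(spectra, scores):
    """Pick the member with the highest identification score.

    Assumption: higher score means a more confident identification.
    """
    best = max(range(len(spectra)), key=lambda i: scores[i])
    return spectra[best]
```

The two selection functions return an existing member of the cluster unchanged, whereas the binning function builds a new, synthetic spectrum; that difference is the main design contrast the sketch is meant to show.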