The Need for Community Standards to Enable Accurate Comparison of Glycoproteomics Algorithm Performance

Protein glycosylation that mediates interactions among viral proteins, host receptors, and immune molecules is an important consideration for predicting viral antigenicity. Viral spike proteins, the proteins responsible for host cell invasion, are especially important to examine. However, there...

Bibliographic Details
Main authors: Hackett, William E., Zaia, Joseph
Format: Online Article Text
Language: English
Published: MDPI 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8398183/
https://www.ncbi.nlm.nih.gov/pubmed/34443345
http://dx.doi.org/10.3390/molecules26164757
author Hackett, William E.
Zaia, Joseph
collection PubMed
description Protein glycosylation that mediates interactions among viral proteins, host receptors, and immune molecules is an important consideration for predicting viral antigenicity. Viral spike proteins, the proteins responsible for host cell invasion, are especially important to examine. However, a lack of consensus within the field of glycoproteomics regarding identification strategy and false discovery rate (FDR) calculation impedes such examinations. As a case study in the overlap between software, we examine recently published SARS-CoV-2 glycoprotein datasets using four glycoproteomics identification software packages with their recommended protocols: GlycReSoft, Byonic, pGlyco2, and MSFragger-Glyco. These packages use different forms of Target-Decoy Analysis (TDA) to estimate FDR and have different database-oriented search methods with varying degrees of quantification capability. Instead of an ideal overlap between software, we observed different sets of identifications with the intersection. When clustering by glycopeptide identifications, we see higher degrees of relatedness within software than within glycosites. Taking the consensus between results yields a conservative and non-informative conclusion, as we lose identifications in the desire for caution; these non-consensus identifications are often of lower abundance and, therefore, more susceptible to nuanced changes. We conclude that present glycoproteomics software packages are not directly comparable, and that methods are needed to assess their overall results and FDR estimation performance. Once such tools are developed, it will be possible to improve FDR methods and quantify complex glycoproteomes with acceptable confidence, rather than in potentially misleading broad strokes.
format Online
Article
Text
id pubmed-8398183
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8398183 2021-08-29 The Need for Community Standards to Enable Accurate Comparison of Glycoproteomics Algorithm Performance Hackett, William E. Zaia, Joseph Molecules Article Protein glycosylation that mediates interactions among viral proteins, host receptors, and immune molecules is an important consideration for predicting viral antigenicity. Viral spike proteins, the proteins responsible for host cell invasion, are especially important to examine. However, a lack of consensus within the field of glycoproteomics regarding identification strategy and false discovery rate (FDR) calculation impedes such examinations. As a case study in the overlap between software, we examine recently published SARS-CoV-2 glycoprotein datasets using four glycoproteomics identification software packages with their recommended protocols: GlycReSoft, Byonic, pGlyco2, and MSFragger-Glyco. These packages use different forms of Target-Decoy Analysis (TDA) to estimate FDR and have different database-oriented search methods with varying degrees of quantification capability. Instead of an ideal overlap between software, we observed different sets of identifications with the intersection. When clustering by glycopeptide identifications, we see higher degrees of relatedness within software than within glycosites. Taking the consensus between results yields a conservative and non-informative conclusion, as we lose identifications in the desire for caution; these non-consensus identifications are often of lower abundance and, therefore, more susceptible to nuanced changes. We conclude that present glycoproteomics software packages are not directly comparable, and that methods are needed to assess their overall results and FDR estimation performance. Once such tools are developed, it will be possible to improve FDR methods and quantify complex glycoproteomes with acceptable confidence, rather than in potentially misleading broad strokes.
MDPI 2021-08-06 /pmc/articles/PMC8398183/ /pubmed/34443345 http://dx.doi.org/10.3390/molecules26164757 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title The Need for Community Standards to Enable Accurate Comparison of Glycoproteomics Algorithm Performance
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8398183/
https://www.ncbi.nlm.nih.gov/pubmed/34443345
http://dx.doi.org/10.3390/molecules26164757