Measuring pathway database coverage of the phosphoproteome

Protein phosphorylation is one of the best-known post-translational mechanisms and plays a key role in the regulation of cellular processes. Over 100,000 distinct phosphorylation sites have been discovered through constant improvement of mass spectrometry-based phosphoproteomics in the last decade. Ho...

Bibliographic Details
Main authors: Huckstep, Hannah, Fearnley, Liam G., Davis, Melissa J.
Format: Online Article Text
Language: English
Published: PeerJ Inc. 2021
Subjects: Bioinformatics
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8162239/
https://www.ncbi.nlm.nih.gov/pubmed/34113485
http://dx.doi.org/10.7717/peerj.11298
author Huckstep, Hannah
Fearnley, Liam G.
Davis, Melissa J.
collection PubMed
description Protein phosphorylation is one of the best-known post-translational mechanisms and plays a key role in the regulation of cellular processes. Over 100,000 distinct phosphorylation sites have been discovered through constant improvement of mass spectrometry-based phosphoproteomics in the last decade. However, data saturation is occurring, and the bottleneck of assigning biologically relevant functionality to phosphosites needs to be addressed. Data-driven approaches have had limited success in revealing phosphosite functionality because of a range of limitations. The alternative, more suitable approach is to make use of prior knowledge from literature-derived databases. Here, we analysed seven widely used databases to shed light on their suitability for providing functional insights into phosphoproteomics data. We first determined the global coverage of each database at both the protein and phosphosite level. We also determined how consistent each database was in its phosphorylation annotations compared to a global standard. Finally, we looked in detail at the coverage of each database over six experimental datasets. Our analysis highlights the relative strengths and weaknesses of each database, providing a guide to how each can best be used to identify biological mechanisms in phosphoproteomic data.
format Online
Article
Text
id pubmed-8162239
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher PeerJ Inc.
record_format MEDLINE/PubMed
spelling pubmed-8162239 2021-06-09 Measuring pathway database coverage of the phosphoproteome Huckstep, Hannah Fearnley, Liam G. Davis, Melissa J. PeerJ Bioinformatics PeerJ Inc. 2021-05-25 /pmc/articles/PMC8162239/ /pubmed/34113485 http://dx.doi.org/10.7717/peerj.11298 Text en © 2021 Huckstep et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed.
For attribution, the original author(s), title, publication source (PeerJ) and either DOI or URL of the article must be cited.
title Measuring pathway database coverage of the phosphoproteome
topic Bioinformatics