
ProMEX: a mass spectral reference database for proteins and protein phosphorylation sites

BACKGROUND: In the last decade, techniques were established for the large scale genome-wide analysis of proteins, RNA, and metabolites, and database solutions have been developed to manage the generated data sets. The Golm Metabolome Database for metabolite data (GMD) represents one such effort to m...


Bibliographic Details
Main authors: Hummel, Jan, Niemann, Michaela, Wienkoop, Stefanie, Schulze, Waltraud, Steinhauser, Dirk, Selbig, Joachim, Walther, Dirk, Weckwerth, Wolfram
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1920535/
https://www.ncbi.nlm.nih.gov/pubmed/17587460
http://dx.doi.org/10.1186/1471-2105-8-216
author Hummel, Jan
Niemann, Michaela
Wienkoop, Stefanie
Schulze, Waltraud
Steinhauser, Dirk
Selbig, Joachim
Walther, Dirk
Weckwerth, Wolfram
collection PubMed
description BACKGROUND: In the last decade, techniques were established for the large-scale, genome-wide analysis of proteins, RNA, and metabolites, and database solutions have been developed to manage the generated data sets. The Golm Metabolome Database for metabolite data (GMD) represents one such effort to make these data broadly available and to interconnect the different molecular levels of a biological system [1]. As data interpretation in the light of already existing data becomes increasingly important, these initiatives are an essential part of current and future systems biology.
RESULTS: A mass spectral library consisting of experimentally derived tryptic peptide product ion spectra was generated based on liquid chromatography coupled to ion trap mass spectrometry (LC-IT-MS). Protein samples derived from Arabidopsis thaliana, Chlamydomonas reinhardii, Medicago truncatula, and Sinorhizobium meliloti were analysed. Currently comprising 4,557 manually validated spectra associated with 4,226 unique peptides from 1,367 proteins, the database serves as a continuously growing reference data set and can be used for protein identification and quantification in uncharacterized biological samples. For peptide identification, several algorithms were implemented based on a recently published study for peptide mass fingerprinting [2] and tested for false-positive and false-negative rates. An algorithm that considers the intensity distribution when computing match correlation scores yielded the best results. As a proof of concept, an LC-IT-MS analysis of a tryptic leaf protein digest was converted to mzData format and searched against the mass spectral library. The utility of the mass spectral library was also tested for the identification of phosphorylated tryptic peptides. We included in vivo phosphorylation sites of Arabidopsis thaliana proteins, and identification performance improved compared to genome-based search algorithms. Protein identification by ProMEX is linked to other levels of biological organization such as metabolite, pathway, and transcript data. The database is further connected to annotation and classification services via BioMoby.
CONCLUSION: The ProMEX protein/peptide database represents a mass spectral reference library with the capability of matching unknown samples for protein identification. The database allows text searches based on metadata such as experimental information about the samples, mass spectrometric instrument parameters, or unique protein identifiers such as AGI codes. ProMEX integrates proteomics data with other levels of molecular organization, including metabolite, pathway, and transcript information, and may thus become a useful resource for plant systems biology studies. The ProMEX mass spectral library is available at .
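The abstract describes matching an unknown product ion spectrum against a reference library with a score that takes peak intensities into account. The record does not give the scoring function itself, so the following is only a minimal sketch of one common intensity-aware approach: cosine similarity over binned intensities. The function names, the 1.0 m/z bin width, and the peak lists are all hypothetical illustrations, not ProMEX's implementation or data.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins (bin width is an assumption)."""
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[int(mz // bin_width)] += intensity
    return binned


def cosine_score(query, reference, bin_width=1.0):
    """Cosine similarity of two binned spectra; 0.0 if either spectrum is empty."""
    q = bin_spectrum(query, bin_width)
    r = bin_spectrum(reference, bin_width)
    dot = sum(q[b] * r[b] for b in q.keys() & r.keys())
    norm = (math.sqrt(sum(v * v for v in q.values()))
            * math.sqrt(sum(v * v for v in r.values())))
    return dot / norm if norm else 0.0


def best_match(query, library, bin_width=1.0):
    """Return (score, name) of the highest-scoring library spectrum."""
    scored = [(cosine_score(query, peaks, bin_width), name)
              for name, peaks in library.items()]
    return max(scored) if scored else (0.0, None)


# Made-up (m/z, intensity) peak lists, for illustration only.
library = {
    "PEPTIDE_A": [(175.1, 30.0), (262.2, 100.0), (375.3, 55.0)],
    "PEPTIDE_B": [(147.1, 80.0), (300.2, 20.0), (420.4, 100.0)],
}
query = [(175.2, 28.0), (262.1, 95.0), (375.3, 60.0)]

score, name = best_match(query, library)
print(f"best match: {name} (score {score:.3f})")
```

Because the score depends on relative intensities and not only on which fragment masses are present, two spectra that share peak positions but differ strongly in intensity pattern score lower than a close replicate. A production search would additionally filter candidates by precursor mass and estimate error rates, which this sketch omits.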
format Text
id pubmed-1920535
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1920535 2007-07-17 ProMEX: a mass spectral reference database for proteins and protein phosphorylation sites Hummel, Jan Niemann, Michaela Wienkoop, Stefanie Schulze, Waltraud Steinhauser, Dirk Selbig, Joachim Walther, Dirk Weckwerth, Wolfram BMC Bioinformatics Database BioMed Central 2007-06-23 /pmc/articles/PMC1920535/ /pubmed/17587460 http://dx.doi.org/10.1186/1471-2105-8-216 Text en Copyright © 2007 Hummel et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title ProMEX: a mass spectral reference database for proteins and protein phosphorylation sites
topic Database
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1920535/
https://www.ncbi.nlm.nih.gov/pubmed/17587460
http://dx.doi.org/10.1186/1471-2105-8-216