
Analysis of High Accuracy, Quantitative Proteomics Data in the MaxQB Database

MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For e...


Bibliographic Details
Main Authors: Schaab, Christoph, Geiger, Tamar, Stoehr, Gabriele, Cox, Juergen, Mann, Matthias
Format: Online Article Text
Language: English
Published: The American Society for Biochemistry and Molecular Biology 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3316731/
https://www.ncbi.nlm.nih.gov/pubmed/22301388
http://dx.doi.org/10.1074/mcp.M111.014068
_version_ 1782228462009843712
author Schaab, Christoph
Geiger, Tamar
Stoehr, Gabriele
Cox, Juergen
Mann, Matthias
collection PubMed
description MS-based proteomics generates rapidly increasing amounts of precise and quantitative information. Analysis of individual proteomic experiments has made great strides, but the crucial ability to compare and store information across different proteome measurements still presents many challenges. For example, it has been difficult to avoid contamination of databases with low quality peptide identifications, to control for the inflation in false positive identifications when combining data sets, and to integrate quantitative data. Although, for example, the contamination with low quality identifications has been addressed by joint analysis of deposited raw data in some public repositories, we reasoned that there should be a role for a database specifically designed for high resolution and quantitative data. Here we describe a novel database termed MaxQB that stores and displays collections of large proteomics projects and allows joint analysis and comparison. We demonstrate the analysis tools of MaxQB using proteome data of 11 different human cell lines and 28 mouse tissues. The database-wide false discovery rate is controlled by adjusting the project specific cutoff scores for the combined data sets. The 11 cell line proteomes together identify proteins expressed from more than half of all human genes. For each protein of interest, expression levels estimated by label-free quantification can be visualized across the cell lines. Similarly, the expression rank order and estimated amount of each protein within each proteome are plotted. We used MaxQB to calculate the signal reproducibility of the detected peptides for the same proteins across different proteomes. Spearman rank correlation between peptide intensity and detection probability of identified proteins was greater than 0.8 for 64% of the proteome, whereas a minority of proteins have negative correlation. 
This information can be used to pinpoint false protein identifications, independently of peptide database scores. The information contained in MaxQB, including high resolution fragment spectra, is accessible to the community via a user-friendly web interface at http://www.biochem.mpg.de/maxqb.
format Online
Article
Text
id pubmed-3316731
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher The American Society for Biochemistry and Molecular Biology
record_format MEDLINE/PubMed
spelling pubmed-3316731 2012-04-10 Analysis of High Accuracy, Quantitative Proteomics Data in the MaxQB Database Schaab, Christoph Geiger, Tamar Stoehr, Gabriele Cox, Juergen Mann, Matthias Mol Cell Proteomics Special Issue: Prospects in Space and Time The American Society for Biochemistry and Molecular Biology 2012-03 2012-02-02 /pmc/articles/PMC3316731/ /pubmed/22301388 http://dx.doi.org/10.1074/mcp.M111.014068 Text en © 2012 by The American Society for Biochemistry and Molecular Biology, Inc. Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/) applies to Author Choice Articles
title Analysis of High Accuracy, Quantitative Proteomics Data in the MaxQB Database
topic Special Issue: Prospects in Space and Time
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3316731/
https://www.ncbi.nlm.nih.gov/pubmed/22301388
http://dx.doi.org/10.1074/mcp.M111.014068