Estimating probabilities of peptide database identifications to LC-FTICR-MS observations
BACKGROUND: The field of proteomics involves the characterization of the peptides and proteins expressed in a cell under specific conditions. Proteomics has made rapid advances in recent years following the sequencing of the genomes of an increasing number of organisms. A prominent technology for hi...
Main authors: | Anderson, Kevin K; Monroe, Matthew E; Daly, Don S |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | Methodology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1450261/ https://www.ncbi.nlm.nih.gov/pubmed/16504106 http://dx.doi.org/10.1186/1477-5956-4-1 |
_version_ | 1782127382297051136 |
---|---|
author | Anderson, Kevin K Monroe, Matthew E Daly, Don S |
author_facet | Anderson, Kevin K Monroe, Matthew E Daly, Don S |
author_sort | Anderson, Kevin K |
collection | PubMed |
description | BACKGROUND: The field of proteomics involves the characterization of the peptides and proteins expressed in a cell under specific conditions. Proteomics has made rapid advances in recent years following the sequencing of the genomes of an increasing number of organisms. A prominent technology for high throughput proteomics analysis is the use of liquid chromatography coupled to Fourier transform ion cyclotron resonance mass spectrometry (LC-FTICR-MS). Meaningful biological conclusions can best be made when the peptide identities returned by this technique are accompanied by measures of accuracy and confidence. METHODS: After a tryptically digested protein mixture is analyzed by LC-FTICR-MS, the observed masses and normalized elution times of the detected features are statistically matched to the theoretical masses and elution times of known peptides listed in a large database. The probability of matching is estimated for each peptide in the reference database using statistical classification methods assuming bivariate Gaussian probability distributions on the uncertainties in the masses and the normalized elution times. RESULTS: A database of 69,220 features from 32 LC-FTICR-MS analyses of a tryptically digested bovine serum albumin (BSA) sample was matched to a database populated with 97% false positive peptides. The percentage of high confidence identifications was found to be consistent with other database search procedures. BSA database peptides were identified with high confidence on average in 14.1 of the 32 analyses. False positives were identified on average in just 2.7 analyses. CONCLUSION: Using a priori probabilities that contrast peptides from expected and unexpected proteins was shown to perform better in identifying target peptides than using equally likely a priori probabilities. This is because a large percentage of the target peptides were similar to unexpected peptides which were included to be false positives. The use of triplicate analyses with a "2 out of 3" reporting rule was shown to have excellent rejection of false positives. |
format | Text |
id | pubmed-1450261 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-14502612006-04-29 Estimating probabilities of peptide database identifications to LC-FTICR-MS observations Anderson, Kevin K Monroe, Matthew E Daly, Don S Proteome Sci Methodology BACKGROUND: The field of proteomics involves the characterization of the peptides and proteins expressed in a cell under specific conditions. Proteomics has made rapid advances in recent years following the sequencing of the genomes of an increasing number of organisms. A prominent technology for high throughput proteomics analysis is the use of liquid chromatography coupled to Fourier transform ion cyclotron resonance mass spectrometry (LC-FTICR-MS). Meaningful biological conclusions can best be made when the peptide identities returned by this technique are accompanied by measures of accuracy and confidence. METHODS: After a tryptically digested protein mixture is analyzed by LC-FTICR-MS, the observed masses and normalized elution times of the detected features are statistically matched to the theoretical masses and elution times of known peptides listed in a large database. The probability of matching is estimated for each peptide in the reference database using statistical classification methods assuming bivariate Gaussian probability distributions on the uncertainties in the masses and the normalized elution times. RESULTS: A database of 69,220 features from 32 LC-FTICR-MS analyses of a tryptically digested bovine serum albumin (BSA) sample was matched to a database populated with 97% false positive peptides. The percentage of high confidence identifications was found to be consistent with other database search procedures. BSA database peptides were identified with high confidence on average in 14.1 of the 32 analyses. False positives were identified on average in just 2.7 analyses. CONCLUSION: Using a priori probabilities that contrast peptides from expected and unexpected proteins was shown to perform better in identifying target peptides than using equally likely a priori probabilities. This is because a large percentage of the target peptides were similar to unexpected peptides which were included to be false positives. The use of triplicate analyses with a "2 out of 3" reporting rule was shown to have excellent rejection of false positives. BioMed Central 2006-02-24 /pmc/articles/PMC1450261/ /pubmed/16504106 http://dx.doi.org/10.1186/1477-5956-4-1 Text en Copyright © 2006 Anderson et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Anderson, Kevin K Monroe, Matthew E Daly, Don S Estimating probabilities of peptide database identifications to LC-FTICR-MS observations |
title | Estimating probabilities of peptide database identifications to LC-FTICR-MS observations |
title_sort | estimating probabilities of peptide database identifications to lc-fticr-ms observations |
topic | Methodology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1450261/ https://www.ncbi.nlm.nih.gov/pubmed/16504106 http://dx.doi.org/10.1186/1477-5956-4-1 |
work_keys_str_mv | AT andersonkevink estimatingprobabilitiesofpeptidedatabaseidentificationstolcfticrmsobservations AT monroematthewe estimatingprobabilitiesofpeptidedatabaseidentificationstolcfticrmsobservations AT dalydons estimatingprobabilitiesofpeptidedatabaseidentificationstolcfticrmsobservations |
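
The matching step summarized in the METHODS portion of the abstract (a feature's observed mass and normalized elution time compared against database peptides under a bivariate Gaussian error model, with a priori probabilities per peptide) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dictionary keys (`mass`, `net`, `prior`), the zero-correlation default, the use of absolute rather than relative mass error, and the choice to normalize over the candidate peptides only are all assumptions made for illustration. The `two_of_three` helper is likewise a hypothetical rendering of the "2 out of 3" reporting rule for triplicate analyses.

```python
import math


def bivariate_gaussian_pdf(d_mass, d_net, sd_mass, sd_net, rho=0.0):
    """Zero-mean bivariate Gaussian density at a (mass error, elution-time error) pair.

    sd_mass, sd_net and rho are assumed, illustrative parameters.
    """
    u = d_mass / sd_mass
    v = d_net / sd_net
    one_minus_r2 = 1.0 - rho ** 2
    z = u * u - 2.0 * rho * u * v + v * v
    norm = 2.0 * math.pi * sd_mass * sd_net * math.sqrt(one_minus_r2)
    return math.exp(-z / (2.0 * one_minus_r2)) / norm


def match_probabilities(feature, peptides, sd_mass, sd_net, rho=0.0):
    """Posterior probability of each candidate peptide for one feature (Bayes' rule).

    Assumption: the feature is taken to come from one of the listed peptides,
    so the posteriors are normalized to sum to 1 over the candidates.
    """
    weights = []
    for pep in peptides:
        likelihood = bivariate_gaussian_pdf(
            feature["mass"] - pep["mass"],
            feature["net"] - pep["net"],
            sd_mass,
            sd_net,
            rho,
        )
        weights.append(pep["prior"] * likelihood)
    total = sum(weights)
    if total == 0.0:
        return [0.0 for _ in weights]
    return [w / total for w in weights]


def two_of_three(identified):
    """Report a peptide only if it was identified in at least 2 of 3 replicate analyses."""
    return sum(bool(x) for x in identified) >= 2


# Hypothetical example values, not data from the article.
feature = {"mass": 1000.0, "net": 0.50}
peptides = [
    {"name": "expected_peptide", "mass": 1000.0, "net": 0.50, "prior": 0.5},
    {"name": "unexpected_peptide", "mass": 1000.01, "net": 0.60, "prior": 0.5},
]
probs = match_probabilities(feature, peptides, sd_mass=0.005, sd_net=0.02)
```

Raising the `prior` of peptides from expected proteins relative to unexpected ones is how this sketch would mimic the contrast in a priori probabilities mentioned in the CONCLUSION; the specific prior values used in the article are not given in this record.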