
RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics

BACKGROUND: The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. RESULTS: Using a simple scoring scheme, we propose a database search method...


Bibliographic Details
Main authors: Alves, Gelio; Ogurtsov, Aleksey Y; Yu, Yi-Kuo
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2211744/
https://www.ncbi.nlm.nih.gov/pubmed/17961253
http://dx.doi.org/10.1186/1745-6150-2-25
author Alves, Gelio
Ogurtsov, Aleksey Y
Yu, Yi-Kuo
collection PubMed
description BACKGROUND: The key to mass-spectrometry-based proteomics is peptide identification. A major challenge in peptide identification is to obtain realistic E-values when assigning statistical significance to candidate peptides. RESULTS: Using a simple scoring scheme, we propose a database search method with theoretically characterized statistics. Taking into account possible skewness in the random variable distribution and the effect of finite sampling, we provide a theoretical derivation for the tail of the score distribution. For every experimental spectrum examined, we collect the scores of peptides in the database, and find good agreement between the collected score statistics and our theoretical distribution. Using Student's t-tests, we quantify the degree of agreement between the theoretical distribution and the score statistics collected. The T-tests may be used to measure the reliability of reported statistics. When combined with reported P-value for a peptide hit using a score distribution model, this new measure prevents exaggerated statistics. Another feature of RAId_DbS is its capability of detecting multiple co-eluted peptides. The peptide identification performance and statistical accuracy of RAId_DbS are assessed and compared with several other search tools. The executables and data related to RAId_DbS are freely available upon request.
format Text
id pubmed-2211744
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2211744 2008-01-23 RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics Alves, Gelio Ogurtsov, Aleksey Y Yu, Yi-Kuo Biol Direct Research BioMed Central 2007-10-25 /pmc/articles/PMC2211744/ /pubmed/17961253 http://dx.doi.org/10.1186/1745-6150-2-25 Text en Copyright © 2007 Alves et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title RAId_DbS: Peptide Identification using Database Searches with Realistic Statistics
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2211744/
https://www.ncbi.nlm.nih.gov/pubmed/17961253
http://dx.doi.org/10.1186/1745-6150-2-25