
An adaptive classification model for peptide identification

BACKGROUND: Peptide sequence assignment is the central task in protein identification with MS/MS-based strategies. Although a number of post-database search algorithms for filtering target peptide spectrum matches (PSMs) have been developed, the discrepancy among the output PSMs is usually significa...


Bibliographic Details
Main authors: Liang, Xijun, Xia, Zhonghang, Jian, Ling, Niu, Xinnan, Link, Andrew
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4652454/
https://www.ncbi.nlm.nih.gov/pubmed/26578406
http://dx.doi.org/10.1186/1471-2164-16-S11-S1
_version_ 1782401757309042688
author Liang, Xijun
Xia, Zhonghang
Jian, Ling
Niu, Xinnan
Link, Andrew
author_sort Liang, Xijun
collection PubMed
description BACKGROUND: Peptide sequence assignment is the central task in protein identification with MS/MS-based strategies. Although a number of post-database search algorithms for filtering target peptide spectrum matches (PSMs) have been developed, the discrepancy among the output PSMs is usually significant, leaving a few disputable PSMs. Current studies show that a number of target PSMs that are close to decoy PSMs can hardly be separated from those decoys using the discrimination function alone. RESULTS: In this paper, we assign each target PSM a weight indicating how likely it is to be correct. We employ an SVM-based learning model to search for the optimal weight for each target PSM and develop a new scoring system, CRanker, to rank all target PSMs. Because routine database searches generate large PSM datasets, we use the Cholesky factorization technique for storing the kernel matrix to reduce the memory requirement. CONCLUSIONS: Compared with PeptideProphet and Percolator, CRanker identified more PSMs at similar false discovery rates across different datasets. CRanker showed consistent performance on different test sets, supporting the reasonableness of the proposed model.
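The abstract mentions storing the kernel matrix via Cholesky factorization to save memory but does not say which variant is used. The following is a minimal sketch of one common approach, a pivoted incomplete Cholesky factorization, and is not CRanker's actual implementation. The RBF kernel, the `gamma` value, and the names `rbf_kernel`, `pivoted_cholesky`, and `reconstruct` are all assumptions introduced for illustration. It shows the idea of keeping an n x r factor G with K ≈ G Gᵀ instead of the full n x n kernel matrix.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """RBF kernel; the kernel choice and gamma are illustrative assumptions."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def pivoted_cholesky(points, kernel, max_rank, tol=1e-8):
    """Return (factor, pivots) with K[i][j] ~= dot(factor[i], factor[j]).

    Only one kernel column is evaluated per step, so the full n x n
    matrix is never held in memory; storage is n * rank numbers.
    """
    n = len(points)
    # Diagonal of the residual K - G G^T, updated after every step.
    residual_diag = [kernel(p, p) for p in points]
    factor = [[] for _ in range(n)]
    pivots = []
    for _ in range(min(max_rank, n)):
        # Greedy pivot: the point with the largest remaining residual.
        pivot = max(range(n), key=lambda i: residual_diag[i])
        if residual_diag[pivot] <= tol:
            break  # the remaining residual is negligible
        root = math.sqrt(residual_diag[pivot])
        pivot_row = list(factor[pivot])  # copy before it is extended
        for i in range(n):
            k_ip = kernel(points[i], points[pivot])
            dot = sum(a * b for a, b in zip(factor[i], pivot_row))
            g = (k_ip - dot) / root
            factor[i].append(g)
            residual_diag[i] -= g * g
        pivots.append(pivot)
    return factor, pivots


def reconstruct(factor, i, j):
    """Approximate kernel entry K[i][j] recovered from the low-rank factor."""
    return sum(a * b for a, b in zip(factor[i], factor[j]))
```

With `max_rank` much smaller than the number of PSMs, the factor needs n * r numbers rather than n * n, which is the memory saving the abstract alludes to; the accuracy of the approximation depends on how quickly the kernel matrix's spectrum decays.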
format Online
Article
Text
id pubmed-4652454
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4652454 2015-11-25 An adaptive classification model for peptide identification Liang, Xijun Xia, Zhonghang Jian, Ling Niu, Xinnan Link, Andrew BMC Genomics Research BACKGROUND: Peptide sequence assignment is the central task in protein identification with MS/MS-based strategies. Although a number of post-database search algorithms for filtering target peptide spectrum matches (PSMs) have been developed, the discrepancy among the output PSMs is usually significant, leaving a few disputable PSMs. Current studies show that a number of target PSMs that are close to decoy PSMs can hardly be separated from those decoys using the discrimination function alone. RESULTS: In this paper, we assign each target PSM a weight indicating how likely it is to be correct. We employ an SVM-based learning model to search for the optimal weight for each target PSM and develop a new scoring system, CRanker, to rank all target PSMs. Because routine database searches generate large PSM datasets, we use the Cholesky factorization technique for storing the kernel matrix to reduce the memory requirement. CONCLUSIONS: Compared with PeptideProphet and Percolator, CRanker identified more PSMs at similar false discovery rates across different datasets. CRanker showed consistent performance on different test sets, supporting the reasonableness of the proposed model. BioMed Central 2015-11-10 /pmc/articles/PMC4652454/ /pubmed/26578406 http://dx.doi.org/10.1186/1471-2164-16-S11-S1 Text en Copyright © 2015 Liang et al.; http://creativecommons.org/licenses/by/4.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title An adaptive classification model for peptide identification
topic Research