Mass spectrometry-based protein identification by integrating de novo sequencing with database searching

BACKGROUND: Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach firstly infers parti...

Bibliographic Details
Main authors: Wang, Penghao; Wilson, Susan R
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3549845/
https://www.ncbi.nlm.nih.gov/pubmed/23369017
http://dx.doi.org/10.1186/1471-2105-14-S2-S24
_version_ 1782256483430301696
author Wang, Penghao
Wilson, Susan R
collection PubMed
description BACKGROUND: Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach firstly infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, by applying these methods the integrated de novo sequencing makes a limited contribution to the scoring model which is still largely based on database searching. RESULTS: We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods.
format Online
Article
Text
id pubmed-3549845
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3549845 2013-01-23 Mass spectrometry-based protein identification by integrating de novo sequencing with database searching Wang, Penghao Wilson, Susan R BMC Bioinformatics Proceedings BACKGROUND: Mass spectrometry-based protein identification is a very challenging task. The main identification approaches include de novo sequencing and database searching. Both approaches have shortcomings, so an integrative approach has been developed. The integrative approach firstly infers partial peptide sequences, known as tags, directly from tandem spectra through de novo sequencing, and then puts these sequences into a database search to see if a close peptide match can be found. However the current implementation of this integrative approach has several limitations. Firstly, simplistic de novo sequencing is applied and only very short sequence tags are used. Secondly, most integrative methods apply an algorithm similar to BLAST to search for exact sequence matches and do not accommodate sequence errors well. Thirdly, by applying these methods the integrated de novo sequencing makes a limited contribution to the scoring model which is still largely based on database searching. RESULTS: We have developed a new integrative protein identification method which can integrate de novo sequencing more efficiently into database searching. Evaluated on large real datasets, our method outperforms popular identification methods. BioMed Central 2013-01-21 /pmc/articles/PMC3549845/ /pubmed/23369017 http://dx.doi.org/10.1186/1471-2105-14-S2-S24 Text en Copyright ©2013 Wang and Wilson; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Proceedings
Wang, Penghao
Wilson, Susan R
Mass spectrometry-based protein identification by integrating de novo sequencing with database searching
title Mass spectrometry-based protein identification by integrating de novo sequencing with database searching
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3549845/
https://www.ncbi.nlm.nih.gov/pubmed/23369017
http://dx.doi.org/10.1186/1471-2105-14-S2-S24