Novel peptide identification from tandem mass spectra using ESTs and sequence database compression

Peptide identification by tandem mass spectrometry is the dominant proteomics workflow for protein characterization in complex samples. Traditional search engines, which match peptide sequences with tandem mass spectra to identify the samples' proteins, use protein sequence databases to suggest...

Bibliographic Details
Main author:	Edwards, Nathan J
Format:	Text
Language:	English
Published:	Nature Publishing Group 2007
Subjects:	Report
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1865584/ https://www.ncbi.nlm.nih.gov/pubmed/17437027 http://dx.doi.org/10.1038/msb4100142

_version_	1782133243196211200
author	Edwards, Nathan J
author_facet	Edwards, Nathan J
author_sort	Edwards, Nathan J
collection	PubMed
description	Peptide identification by tandem mass spectrometry is the dominant proteomics workflow for protein characterization in complex samples. Traditional search engines, which match peptide sequences with tandem mass spectra to identify the samples' proteins, use protein sequence databases to suggest peptide candidates for consideration. Although the acquisition of tandem mass spectra is not biased toward well-understood protein isoforms, this computational strategy is failing to identify peptides from alternative splicing and coding SNP protein isoforms despite the acquisition of good-quality tandem mass spectra. We propose, instead, that expressed sequence tags (ESTs) be searched. Ordinarily, such a strategy would be computationally infeasible due to the size of EST sequence databases; however, we show that a sophisticated sequence database compression strategy, applied to human ESTs, reduces the sequence database size approximately 35-fold. Once compressed, our EST sequence database is comparable in size to other commonly used protein sequence databases, making routine EST searching feasible. We demonstrate that our EST sequence database enables the discovery of novel peptides in a variety of public data sets.
format	Text
id	pubmed-1865584
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	Nature Publishing Group
record_format	MEDLINE/PubMed
spelling	pubmed-1865584 2007-05-07 Novel peptide identification from tandem mass spectra using ESTs and sequence database compression Edwards, Nathan J Mol Syst Biol Report Peptide identification by tandem mass spectrometry is the dominant proteomics workflow for protein characterization in complex samples. Traditional search engines, which match peptide sequences with tandem mass spectra to identify the samples' proteins, use protein sequence databases to suggest peptide candidates for consideration. Although the acquisition of tandem mass spectra is not biased toward well-understood protein isoforms, this computational strategy is failing to identify peptides from alternative splicing and coding SNP protein isoforms despite the acquisition of good-quality tandem mass spectra. We propose, instead, that expressed sequence tags (ESTs) be searched. Ordinarily, such a strategy would be computationally infeasible due to the size of EST sequence databases; however, we show that a sophisticated sequence database compression strategy, applied to human ESTs, reduces the sequence database size approximately 35-fold. Once compressed, our EST sequence database is comparable in size to other commonly used protein sequence databases, making routine EST searching feasible. We demonstrate that our EST sequence database enables the discovery of novel peptides in a variety of public data sets. Nature Publishing Group 2007-04-17 /pmc/articles/PMC1865584/ /pubmed/17437027 http://dx.doi.org/10.1038/msb4100142 Text en Copyright © 2007, EMBO and Nature Publishing Group
spellingShingle	Report Edwards, Nathan J Novel peptide identification from tandem mass spectra using ESTs and sequence database compression
title	Novel peptide identification from tandem mass spectra using ESTs and sequence database compression
topic	Report
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1865584/ https://www.ncbi.nlm.nih.gov/pubmed/17437027 http://dx.doi.org/10.1038/msb4100142
