Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks


Bibliographic Details
Main Authors: Na, Seungjin, Payne, Samuel H., Bandeira, Nuno
Format: Online Article Text
Language: English
Published: The American Society for Biochemistry and Molecular Biology 2016
Subjects: Technological Innovation and Resources
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5098046/
https://www.ncbi.nlm.nih.gov/pubmed/27609420
http://dx.doi.org/10.1074/mcp.O116.060913
author Na, Seungjin
Payne, Samuel H.
Bandeira, Nuno
collection PubMed
description Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software.
format Online
Article
Text
id pubmed-5098046
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher The American Society for Biochemistry and Molecular Biology
record_format MEDLINE/PubMed
spelling pubmed-5098046 2016-11-17 Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks. Na, Seungjin; Payne, Samuel H.; Bandeira, Nuno. Mol Cell Proteomics. Technological Innovation and Resources. The American Society for Biochemistry and Molecular Biology 2016-11 2016-09-08 /pmc/articles/PMC5098046/ /pubmed/27609420 http://dx.doi.org/10.1074/mcp.O116.060913 Text en © 2016 by The American Society for Biochemistry and Molecular Biology, Inc. Author's Choice: final version free via Creative Commons CC-BY license (http://creativecommons.org/licenses/by/4.0).
title Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks
topic Technological Innovation and Resources