Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks
Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corre...
Main authors: | Na, Seungjin; Payne, Samuel H.; Bandeira, Nuno |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
The American Society for Biochemistry and Molecular Biology
2016
|
Subjects: | Technological Innovation and Resources |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5098046/ https://www.ncbi.nlm.nih.gov/pubmed/27609420 http://dx.doi.org/10.1074/mcp.O116.060913 |
_version_ | 1782465704351498240 |
---|---|
author | Na, Seungjin Payne, Samuel H. Bandeira, Nuno |
author_facet | Na, Seungjin Payne, Samuel H. Bandeira, Nuno |
author_sort | Na, Seungjin |
collection | PubMed |
description | Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software. |
format | Online Article Text |
id | pubmed-5098046 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | The American Society for Biochemistry and Molecular Biology |
record_format | MEDLINE/PubMed |
spelling | pubmed-50980462016-11-17 Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks Na, Seungjin Payne, Samuel H. Bandeira, Nuno Mol Cell Proteomics Technological Innovation and Resources Peptide and protein identification remains challenging in organisms with poorly annotated or rapidly evolving genomes, as are commonly encountered in environmental or biofuels research. Such limitations render tandem mass spectrometry (MS/MS) database search algorithms ineffective as they lack corresponding sequences required for peptide-spectrum matching. We address this challenge with the spectral networks approach to (1) match spectra of orthologous peptides across multiple related species and then (2) propagate peptide annotations from identified to unidentified spectra. We here present algorithms to assess the statistical significance of spectral alignments (Align-GF), reduce the impurity in spectral networks, and accurately estimate the error rate in propagated identifications. Analyzing three related Cyanothece species, a model organism for biohydrogen production, spectral networks identified peptides from highly divergent sequences from networks with dozens of variant peptides, including thousands of peptides in species lacking a sequenced genome. Our analysis further detected the presence of many novel putative peptides even in genomically characterized species, thus suggesting the possibility of gaps in our understanding of their proteomic and genomic expression. A web-based pipeline for spectral networks analysis is available at http://proteomics.ucsd.edu/software. The American Society for Biochemistry and Molecular Biology 2016-11 2016-09-08 /pmc/articles/PMC5098046/ /pubmed/27609420 http://dx.doi.org/10.1074/mcp.O116.060913 Text en © 2016 by The American Society for Biochemistry and Molecular Biology, Inc. 
Author's Choice—Final version free via Creative Commons CC-BY license (http://creativecommons.org/licenses/by/4.0) . |
spellingShingle | Technological Innovation and Resources Na, Seungjin Payne, Samuel H. Bandeira, Nuno Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks |
title | Multi-species Identification of Polymorphic Peptide Variants via Propagation in Spectral Networks |
title_sort | multi-species identification of polymorphic peptide variants via propagation in spectral networks |
topic | Technological Innovation and Resources |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5098046/ https://www.ncbi.nlm.nih.gov/pubmed/27609420 http://dx.doi.org/10.1074/mcp.O116.060913 |