Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function

Advances in high-throughput mass spectrometry are making proteomics an increasingly important tool in genome annotation projects. Peptides detected in mass spectrometry experiments can be used to validate gene models and verify the translation of putative coding sequences (CDSs). Here, we have ident...

Bibliographic Details
Main authors: Ezkurdia, Iakes; del Pozo, Angela; Frankish, Adam; Rodriguez, Jose Manuel; Harrow, Jennifer; Ashman, Keith; Valencia, Alfonso; Tress, Michael L.
Format: Online Article Text
Language: English
Published: Oxford University Press 2012
Subjects: Research Articles
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424414/
https://www.ncbi.nlm.nih.gov/pubmed/22446687
http://dx.doi.org/10.1093/molbev/mss100
author Ezkurdia, Iakes
del Pozo, Angela
Frankish, Adam
Rodriguez, Jose Manuel
Harrow, Jennifer
Ashman, Keith
Valencia, Alfonso
Tress, Michael L.
collection PubMed
description Advances in high-throughput mass spectrometry are making proteomics an increasingly important tool in genome annotation projects. Peptides detected in mass spectrometry experiments can be used to validate gene models and verify the translation of putative coding sequences (CDSs). Here, we have identified peptides that cover 35% of the genes annotated by the GENCODE consortium for the human genome as part of a comprehensive analysis of experimental spectra from two large publicly available mass spectrometry databases. We detected the translation to protein of “novel” and “putative” protein-coding transcripts as well as transcripts annotated as pseudogenes and nonsense-mediated decay targets. We provide a detailed overview of the population of alternatively spliced protein isoforms that are detectable by peptide identification methods. We found that 150 genes expressed multiple alternative protein isoforms. This constitutes the largest set of reliably confirmed alternatively spliced proteins yet discovered. Three groups of genes were highly overrepresented. We detected alternative isoforms for 10 of the 25 possible heterogeneous nuclear ribonucleoproteins, proteins with a key role in the splicing process. Alternative isoforms generated from interchangeable homologous exons and from short indels were also significantly enriched, both in human experiments and in parallel analyses of mouse and Drosophila proteomics experiments. Our results show that a surprisingly high proportion (almost 25%) of the detected alternative isoforms are only subtly different from their constitutive counterparts. Many of the alternative splicing events that give rise to these alternative isoforms are conserved in mouse. It was striking that very few of these conserved splicing events broke Pfam functional domains or would damage globular protein structures. 
This evidence of a strong bias toward subtle differences in CDS and likely conserved cellular function and structure is remarkable and strongly suggests that the translation of alternative transcripts may be subject to selective constraints.
format Online
Article
Text
id pubmed-3424414
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-3424414 2012-08-22 Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function Ezkurdia, Iakes del Pozo, Angela Frankish, Adam Rodriguez, Jose Manuel Harrow, Jennifer Ashman, Keith Valencia, Alfonso Tress, Michael L. Mol Biol Evol Research Articles Oxford University Press 2012-09 2012-03-22 /pmc/articles/PMC3424414/ /pubmed/22446687 http://dx.doi.org/10.1093/molbev/mss100 Text en © The Author(s) 2012. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution. http://creativecommons.org/licenses/by-nc/3.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comparative Proteomics Reveals a Significant Bias Toward Alternative Protein Isoforms with Conserved Structure and Function
topic Research Articles