Comparison of phosphorylation patterns across eukaryotes by discriminative N-gram analysis

BACKGROUND: How protein phosphorylation relates to kingdom/phylum divergence is largely unknown and the amino acid residues surrounding the phosphorylation site have profound importance on protein kinase–substrate interactions. Standard motif analysis is not adequate for large scale comparative anal...

Bibliographic Details
Main authors: Frades, Itziar, Resjö, Svante, Andreasson, Erik
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4520095/
https://www.ncbi.nlm.nih.gov/pubmed/26224486
http://dx.doi.org/10.1186/s12859-015-0657-2
author Frades, Itziar
Resjö, Svante
Andreasson, Erik
collection PubMed
description BACKGROUND: How protein phosphorylation relates to kingdom/phylum divergence is largely unknown, and the amino acid residues surrounding the phosphorylation site are of profound importance for protein kinase–substrate interactions. Standard motif analysis is not adequate for large-scale comparative analysis because each phosphopeptide is assigned to a unique motif and the approach performs poorly with the unbalanced nature of the input datasets. RESULTS: First, the discriminative n-grams of five species from five different kingdoms/phyla were identified. A signature with 5540 discriminative n-grams that could be found in other species from the same kingdoms/phyla was created. Using a test data set, the ability of the signature to classify species in their corresponding kingdom/phylum was confirmed using classification methods. Lastly, ortholog proteins among proteins with n-grams were identified in order to determine to what degree the identity of the detected n-grams was a property of phosphosites rather than a consequence of species-specific or kingdom/phylum-specific protein inventory. The motifs were grouped in clusters of equal physico-chemical nature; their distribution was similar between species in the same kingdom/phylum, while clear differences were found among species of different kingdoms/phyla. For example, the animal-specific top discriminative n-grams contained many basic amino acids and the plant-specific motifs were mainly acidic. Secondary structure prediction methods show that the discriminative n-grams in the majority of cases lack a regular secondary structure: on average they had 88 % random coil, compared to 66 % in the phosphoproteins they were derived from.
CONCLUSIONS: The discriminative n-grams were able to classify organisms in their corresponding kingdom/phylum. They show different patterns among species of different kingdoms/phyla, and these regions can contribute to evolutionary divergence because they lie in disordered regions that can evolve rapidly. The differences found possibly reflect group-specific differences in the kinomes of the different groups of species. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0657-2) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4520095
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4520095 2015-07-31 Comparison of phosphorylation patterns across eukaryotes by discriminative N-gram analysis Frades, Itziar Resjö, Svante Andreasson, Erik BMC Bioinformatics Research Article BioMed Central 2015-07-30 /pmc/articles/PMC4520095/ /pubmed/26224486 http://dx.doi.org/10.1186/s12859-015-0657-2 Text en © Frades et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Comparison of phosphorylation patterns across eukaryotes by discriminative N-gram analysis
topic Research Article