Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation

BACKGROUND: Polypeptides are composed of amino acids covalently bonded via a peptide bond. The majority of peptide bonds in proteins is found to occur in the trans conformation. In spite of their infrequent occurrence, cis peptide bonds play a key role in the protein structure and function, as well...

Bibliographic Details
Main authors: Exarchos, Konstantinos P, Exarchos, Themis P, Papaloukas, Costas, Troganis, Anastassios N, Fotiadis, Dimitrios I
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2678097/
https://www.ncbi.nlm.nih.gov/pubmed/19379512
http://dx.doi.org/10.1186/1471-2105-10-113
author Exarchos, Konstantinos P
Exarchos, Themis P
Papaloukas, Costas
Troganis, Anastassios N
Fotiadis, Dimitrios I
collection PubMed
description BACKGROUND: Polypeptides are composed of amino acids covalently bonded via a peptide bond. The majority of peptide bonds in proteins is found to occur in the trans conformation. In spite of their infrequent occurrence, cis peptide bonds play a key role in the protein structure and function, as well as in many significant biological processes. RESULTS: We perform a systematic analysis of regions in protein sequences that contain a proline cis peptide bond in order to discover non-random associations between the primary sequence and the nature of proline cis/trans isomerization. For this purpose an efficient pattern discovery algorithm is employed which discovers regular expression-type patterns that are overrepresented (i.e. appear frequently repeated) in a set of sequences. Four types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, iii) pattern discovery using a structural equivalency set and iv) pattern discovery using certain amino acids' physicochemical properties. The extracted patterns are carefully validated using a specially implemented scoring function and a significance measure (i.e. log-probability estimate) indicative of their specificity. The score threshold for the first three types of pattern discovery is 0.90 while for the last type of pattern discovery 0.80. Regarding the significance measure, all patterns yielded values in the range [-9, -31] which ensure that the derived patterns are highly unlikely to have emerged by chance. Among the highest scoring patterns, most of them are consistent with previous investigations concerning the neighborhood of cis proline peptide bonds, and many new ones are identified. Finally, the extracted patterns are systematically compared against the PROSITE database, in order to gain insight into the functional implications of cis prolyl bonds. 
CONCLUSION: Cis patterns with matches in the PROSITE database fell mostly into two main functional clusters: family signatures and protein signatures. However considerable propensity was also observed for targeting signals, active and phosphorylation sites as well as domain signatures.
format Text
id pubmed-2678097
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-26780972009-05-07 Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation Exarchos, Konstantinos P Exarchos, Themis P Papaloukas, Costas Troganis, Anastassios N Fotiadis, Dimitrios I BMC Bioinformatics Research Article BioMed Central 2009-04-20 /pmc/articles/PMC2678097/ /pubmed/19379512 http://dx.doi.org/10.1186/1471-2105-10-113 Text en Copyright © 2009 Exarchos et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation
topic Research Article