Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation
BACKGROUND: Polypeptides are composed of amino acids covalently bonded via a peptide bond. The majority of peptide bonds in proteins is found to occur in the trans conformation. In spite of their infrequent occurrence, cis peptide bonds play a key role in the protein structure and function, as well...
Main authors: | Exarchos, Konstantinos P; Exarchos, Themis P; Papaloukas, Costas; Troganis, Anastassios N; Fotiadis, Dimitrios I |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2678097/ https://www.ncbi.nlm.nih.gov/pubmed/19379512 http://dx.doi.org/10.1186/1471-2105-10-113 |
_version_ | 1782166816918863872 |
---|---|
author | Exarchos, Konstantinos P Exarchos, Themis P Papaloukas, Costas Troganis, Anastassios N Fotiadis, Dimitrios I |
author_facet | Exarchos, Konstantinos P Exarchos, Themis P Papaloukas, Costas Troganis, Anastassios N Fotiadis, Dimitrios I |
author_sort | Exarchos, Konstantinos P |
collection | PubMed |
description | BACKGROUND: Polypeptides are composed of amino acids covalently bonded via a peptide bond. The majority of peptide bonds in proteins is found to occur in the trans conformation. In spite of their infrequent occurrence, cis peptide bonds play a key role in the protein structure and function, as well as in many significant biological processes. RESULTS: We perform a systematic analysis of regions in protein sequences that contain a proline cis peptide bond in order to discover non-random associations between the primary sequence and the nature of proline cis/trans isomerization. For this purpose an efficient pattern discovery algorithm is employed which discovers regular expression-type patterns that are overrepresented (i.e. appear frequently repeated) in a set of sequences. Four types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, iii) pattern discovery using a structural equivalency set and iv) pattern discovery using certain amino acids' physicochemical properties. The extracted patterns are carefully validated using a specially implemented scoring function and a significance measure (i.e. log-probability estimate) indicative of their specificity. The score threshold for the first three types of pattern discovery is 0.90 while for the last type of pattern discovery 0.80. Regarding the significance measure, all patterns yielded values in the range [-9, -31] which ensure that the derived patterns are highly unlikely to have emerged by chance. Among the highest scoring patterns, most of them are consistent with previous investigations concerning the neighborhood of cis proline peptide bonds, and many new ones are identified. Finally, the extracted patterns are systematically compared against the PROSITE database, in order to gain insight into the functional implications of cis prolyl bonds. 
CONCLUSION: Cis patterns with matches in the PROSITE database fell mostly into two main functional clusters: family signatures and protein signatures. However, considerable propensity was also observed for targeting signals, active and phosphorylation sites, as well as domain signatures. |
format | Text |
id | pubmed-2678097 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-26780972009-05-07 Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation Exarchos, Konstantinos P Exarchos, Themis P Papaloukas, Costas Troganis, Anastassios N Fotiadis, Dimitrios I BMC Bioinformatics Research Article BACKGROUND: Polypeptides are composed of amino acids covalently bonded via a peptide bond. The majority of peptide bonds in proteins is found to occur in the trans conformation. In spite of their infrequent occurrence, cis peptide bonds play a key role in the protein structure and function, as well as in many significant biological processes. RESULTS: We perform a systematic analysis of regions in protein sequences that contain a proline cis peptide bond in order to discover non-random associations between the primary sequence and the nature of proline cis/trans isomerization. For this purpose an efficient pattern discovery algorithm is employed which discovers regular expression-type patterns that are overrepresented (i.e. appear frequently repeated) in a set of sequences. Four types of pattern discovery are performed: i) exact pattern discovery, ii) pattern discovery using a chemical equivalency set, iii) pattern discovery using a structural equivalency set and iv) pattern discovery using certain amino acids' physicochemical properties. The extracted patterns are carefully validated using a specially implemented scoring function and a significance measure (i.e. log-probability estimate) indicative of their specificity. The score threshold for the first three types of pattern discovery is 0.90 while for the last type of pattern discovery 0.80. Regarding the significance measure, all patterns yielded values in the range [-9, -31] which ensure that the derived patterns are highly unlikely to have emerged by chance. 
Among the highest scoring patterns, most of them are consistent with previous investigations concerning the neighborhood of cis proline peptide bonds, and many new ones are identified. Finally, the extracted patterns are systematically compared against the PROSITE database, in order to gain insight into the functional implications of cis prolyl bonds. CONCLUSION: Cis patterns with matches in the PROSITE database fell mostly into two main functional clusters: family signatures and protein signatures. However considerable propensity was also observed for targeting signals, active and phosphorylation sites as well as domain signatures. BioMed Central 2009-04-20 /pmc/articles/PMC2678097/ /pubmed/19379512 http://dx.doi.org/10.1186/1471-2105-10-113 Text en Copyright © 2009 Exarchos et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Exarchos, Konstantinos P Exarchos, Themis P Papaloukas, Costas Troganis, Anastassios N Fotiadis, Dimitrios I Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation |
title | Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation |
title_full | Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation |
title_fullStr | Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation |
title_full_unstemmed | Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation |
title_short | Detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation |
title_sort | detection of discriminative sequence patterns in the neighborhood of proline cis peptide bonds and their functional annotation |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2678097/ https://www.ncbi.nlm.nih.gov/pubmed/19379512 http://dx.doi.org/10.1186/1471-2105-10-113 |
work_keys_str_mv | AT exarchoskonstantinosp detectionofdiscriminativesequencepatternsintheneighborhoodofprolinecispeptidebondsandtheirfunctionalannotation AT exarchosthemisp detectionofdiscriminativesequencepatternsintheneighborhoodofprolinecispeptidebondsandtheirfunctionalannotation AT papaloukascostas detectionofdiscriminativesequencepatternsintheneighborhoodofprolinecispeptidebondsandtheirfunctionalannotation AT troganisanastassiosn detectionofdiscriminativesequencepatternsintheneighborhoodofprolinecispeptidebondsandtheirfunctionalannotation AT fotiadisdimitriosi detectionofdiscriminativesequencepatternsintheneighborhoodofprolinecispeptidebondsandtheirfunctionalannotation |