Capturing coevolutionary signals in repeat proteins

BACKGROUND: The analysis of correlations of amino acid occurrences in globular domains has led to the development of statistical tools that can identify native contacts – portions of the chains that come to close distance in folded structural ensembles. Here we introduce a direct coupling analysis f...

Bibliographic Details
Main authors: Espada, Rocío; Parra, R Gonzalo; Mora, Thierry; Walczak, Aleksandra M; Ferreiro, Diego U
Format: Online Article Text
Language: English
Published: BioMed Central 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489039/
https://www.ncbi.nlm.nih.gov/pubmed/26134293
http://dx.doi.org/10.1186/s12859-015-0648-3
_version_ 1782379279811608576
author Espada, Rocío
Parra, R Gonzalo
Mora, Thierry
Walczak, Aleksandra M
Ferreiro, Diego U
author_facet Espada, Rocío
Parra, R Gonzalo
Mora, Thierry
Walczak, Aleksandra M
Ferreiro, Diego U
author_sort Espada, Rocío
collection PubMed
description BACKGROUND: The analysis of correlations of amino acid occurrences in globular domains has led to the development of statistical tools that can identify native contacts – portions of the chains that come to close distance in folded structural ensembles. Here we introduce a direct coupling analysis for repeat proteins – natural systems for which the identification of folding domains remains challenging. RESULTS: We show that the inherent translational symmetry of repeat protein sequences introduces a strong bias in the pair correlations at precisely the length scale of the repeat-unit. Equalizing for this bias in an objective way reveals true co-evolutionary signals from which local native contacts can be identified. Importantly, parameter values obtained for all other interactions are not significantly affected by the equalization. We quantify the robustness of the procedure and assign confidence levels to the interactions, identifying the minimum number of sequences needed to extract evolutionary information in several repeat protein families. CONCLUSIONS: The overall procedure can be used to reconstruct the interactions at distances larger than repeat-pairs, identifying the characteristics of the strongest couplings in each family, and can be applied to any system that appears translationally symmetric. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-015-0648-3) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4489039
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4489039 2015-07-03 Capturing coevolutionary signals in repeat proteins Espada, Rocío Parra, R Gonzalo Mora, Thierry Walczak, Aleksandra M Ferreiro, Diego U BMC Bioinformatics Research Article BioMed Central 2015-07-02 /pmc/articles/PMC4489039/ /pubmed/26134293 http://dx.doi.org/10.1186/s12859-015-0648-3 Text en © Espada et al. 2015 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Research Article
Espada, Rocío
Parra, R Gonzalo
Mora, Thierry
Walczak, Aleksandra M
Ferreiro, Diego U
Capturing coevolutionary signals in repeat proteins
title Capturing coevolutionary signals in repeat proteins
title_full Capturing coevolutionary signals in repeat proteins
title_fullStr Capturing coevolutionary signals in repeat proteins
title_full_unstemmed Capturing coevolutionary signals in repeat proteins
title_short Capturing coevolutionary signals in repeat proteins
title_sort capturing coevolutionary signals in repeat proteins
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4489039/
https://www.ncbi.nlm.nih.gov/pubmed/26134293
http://dx.doi.org/10.1186/s12859-015-0648-3
work_keys_str_mv AT espadarocio capturingcoevolutionarysignalsinrepeatproteins
AT parrargonzalo capturingcoevolutionarysignalsinrepeatproteins
AT morathierry capturingcoevolutionarysignalsinrepeatproteins
AT walczakaleksandram capturingcoevolutionarysignalsinrepeatproteins
AT ferreirodiegou capturingcoevolutionarysignalsinrepeatproteins