A generalized analysis of hydrophobic and loop clusters within globular protein sequences

BACKGROUND: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need for user expertise. In or...

Bibliographic Details
Main authors: Eudes, Richard, Le Tuan, Khanh, Delettré, Jean, Mornon, Jean-Paul, Callebaut, Isabelle
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1774571/
https://www.ncbi.nlm.nih.gov/pubmed/17210072
http://dx.doi.org/10.1186/1472-6807-7-2
_version_ 1782131724114722816
author Eudes, Richard
Le Tuan, Khanh
Delettré, Jean
Mornon, Jean-Paul
Callebaut, Isabelle
author_sort Eudes, Richard
collection PubMed
description BACKGROUND: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need for user expertise. To aid the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet. RESULTS: The structural behavior of hydrophobic cluster species, which are typical of protein globular domains, was investigated within banks of experimental structures, considered at different levels of sequence redundancy. The 294 most frequent hydrophobic cluster species were analyzed with regard to their association with the different secondary structures (frequencies of association with secondary structures and secondary structure propensities). Hydrophobic cluster species are predominantly associated with regular secondary structures, and a large proportion (60%) show preferences for α-helices or β-strands. Moreover, the analysis of the hydrophobic cluster amino acid composition generally allows for finer prediction of the regular secondary structure associated with the considered cluster within a cluster species. We also investigated the behavior of loop-forming clusters, using a "PGDNS" alphabet. These loop clusters do not overlap with hydrophobic clusters and are highly associated with coils. Finally, the structural information contained in the hydrophobic structural words, as deduced from experimental structures, was compared to the PSI-PRED predictions, revealing that β-strands and especially α-helices are generally over-predicted within the limits of typical β and α hydrophobic clusters.
CONCLUSION: The dictionary of hydrophobic clusters described here can help the HCA user to interpret and compare the HCA plots of globular protein sequences, and provides original, fundamental insight into the structural bricks of protein folds. Moreover, the novel loop cluster analysis brings additional information for secondary structure prediction on the whole sequence through a generalized cluster analysis (GCA), and not only on regular secondary structures. Such information lays the foundations for developing a new and original tool for secondary structure prediction.
format Text
id pubmed-1774571
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1774571 2007-01-22 A generalized analysis of hydrophobic and loop clusters within globular protein sequences Eudes, Richard Le Tuan, Khanh Delettré, Jean Mornon, Jean-Paul Callebaut, Isabelle BMC Struct Biol Methodology Article BioMed Central 2007-01-08 /pmc/articles/PMC1774571/ /pubmed/17210072 http://dx.doi.org/10.1186/1472-6807-7-2 Text en Copyright © 2007 Eudes et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title A generalized analysis of hydrophobic and loop clusters within globular protein sequences
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1774571/
https://www.ncbi.nlm.nih.gov/pubmed/17210072
http://dx.doi.org/10.1186/1472-6807-7-2