A generalized analysis of hydrophobic and loop clusters within globular protein sequences
BACKGROUND: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need of user expertise. In or...
Main authors: | Eudes, Richard; Le Tuan, Khanh; Delettré, Jean; Mornon, Jean-Paul; Callebaut, Isabelle |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2007 |
Subjects: | Methodology Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1774571/ https://www.ncbi.nlm.nih.gov/pubmed/17210072 http://dx.doi.org/10.1186/1472-6807-7-2 |
_version_ | 1782131724114722816 |
---|---|
author | Eudes, Richard; Le Tuan, Khanh; Delettré, Jean; Mornon, Jean-Paul; Callebaut, Isabelle |
author_sort | Eudes, Richard |
collection | PubMed |
description | BACKGROUND: Hydrophobic Cluster Analysis (HCA) is an efficient way to compare highly divergent sequences through the implicit secondary structure information directly derived from hydrophobic clusters. However, its efficiency and application are currently limited by the need for user expertise. To help the analysis of HCA plots, we report here the structural preferences of hydrophobic cluster species, which are frequently encountered in globular domains of proteins. These species are characterized only by their hydrophobic/non-hydrophobic dichotomy. This analysis has been extended to loop-forming clusters, using an appropriate loop alphabet.
RESULTS: The structural behavior of hydrophobic cluster species, which are typical of protein globular domains, was investigated within banks of experimental structures, considered at different levels of sequence redundancy. The 294 most frequent hydrophobic cluster species were analyzed with regard to their association with the different secondary structures (frequencies of association with secondary structures and secondary structure propensities). Hydrophobic cluster species are predominantly associated with regular secondary structures, and a large part (60%) reveals preferences for α-helices or β-strands. Moreover, the analysis of the hydrophobic cluster amino acid composition generally allows for finer prediction of the regular secondary structure associated with the considered cluster within a cluster species. We also investigated the behavior of loop-forming clusters, using a "PGDNS" alphabet. These loop clusters do not overlap with hydrophobic clusters and are highly associated with coils. Finally, the structural information contained in the hydrophobic structural words, as deduced from experimental structures, was compared to the PSI-PRED predictions, revealing that β-strands and especially α-helices are generally over-predicted within the limits of typical β and α hydrophobic clusters.
CONCLUSION: The dictionary of hydrophobic clusters described here can help the HCA user to interpret and compare the HCA plots of globular protein sequences, and provides an original fundamental insight into the structural bricks of protein folds. Moreover, the novel loop cluster analysis brings additional information for secondary structure prediction on the whole sequence through a generalized cluster analysis (GCA), and not only on regular secondary structures. Such information lays the foundations for developing a new and original tool for secondary structure prediction. |
format | Text |
id | pubmed-1774571 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2007 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1774571 2007-01-22 A generalized analysis of hydrophobic and loop clusters within globular protein sequences Eudes, Richard Le Tuan, Khanh Delettré, Jean Mornon, Jean-Paul Callebaut, Isabelle BMC Struct Biol Methodology Article BioMed Central 2007-01-08 /pmc/articles/PMC1774571/ /pubmed/17210072 http://dx.doi.org/10.1186/1472-6807-7-2 Text en Copyright © 2007 Eudes et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Eudes, Richard Le Tuan, Khanh Delettré, Jean Mornon, Jean-Paul Callebaut, Isabelle A generalized analysis of hydrophobic and loop clusters within globular protein sequences |
title | A generalized analysis of hydrophobic and loop clusters within globular protein sequences |
title_sort | generalized analysis of hydrophobic and loop clusters within globular protein sequences |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1774571/ https://www.ncbi.nlm.nih.gov/pubmed/17210072 http://dx.doi.org/10.1186/1472-6807-7-2 |