Evolutionarily consistent families in SCOP: sequence, structure and function

Bibliographic Details
Main authors: Pethica, Ralph B; Levitt, Michael; Gough, Julian
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3495643/
https://www.ncbi.nlm.nih.gov/pubmed/23078280
http://dx.doi.org/10.1186/1472-6807-12-27
_version_ 1782249539870130176
author Pethica, Ralph B
Levitt, Michael
Gough, Julian
author_facet Pethica, Ralph B
Levitt, Michael
Gough, Julian
author_sort Pethica, Ralph B
collection PubMed
description BACKGROUND: SCOP is a hierarchical domain classification system for proteins of known structure. The superfamily level has a clear definition: protein domains belong to the same superfamily if there is structural, functional and sequence evidence for a common evolutionary ancestor. Superfamilies are sub-classified into families; however, there is no such clear basis for the family-level groupings. Do SCOP families group together domains with similar sequence, with similar structure, or with a common function? We address these questions and, most importantly, ask whether each family represents a distinct phylogenetic group within a superfamily. RESULTS: Several phylogenetic trees were generated for each superfamily: one derived from a multiple sequence alignment, one based on structural distances, and the final two from presence/absence of GO terms or EC numbers assigned to domains. The topologies of the resulting trees and their confidence values were compared to the SCOP family classification. CONCLUSIONS: We show that SCOP family groupings are evolutionarily consistent to a very high degree with respect to classical sequence phylogenetics. The trees built from (automatically generated) structural distances correlate well, but are not always consistent with SCOP (hand-annotated) groupings. Trees derived from functional data are less consistent with the family level than those from structure or sequence, though the majority still agree. Much of the GO and EC annotation applies directly to one family or a subset of the family; relatively few terms apply at the superfamily level. Maximum sequence diversity within a family is on average 22% but close to zero for superfamilies.
format Online
Article
Text
id pubmed-3495643
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3495643 2012-11-13 Evolutionarily consistent families in SCOP: sequence, structure and function Pethica, Ralph B Levitt, Michael Gough, Julian BMC Struct Biol Research Article
BioMed Central 2012-10-18 /pmc/articles/PMC3495643/ /pubmed/23078280 http://dx.doi.org/10.1186/1472-6807-12-27 Text en Copyright ©2012 Pethica et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Pethica, Ralph B
Levitt, Michael
Gough, Julian
Evolutionarily consistent families in SCOP: sequence, structure and function
title Evolutionarily consistent families in SCOP: sequence, structure and function
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3495643/
https://www.ncbi.nlm.nih.gov/pubmed/23078280
http://dx.doi.org/10.1186/1472-6807-12-27