
An Algebro-Topological Description of Protein Domain Structure

The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular i...


Bibliographic Details
Main authors: Penner, Robert Clark, Knudsen, Michael, Wiuf, Carsten, Andersen, Jørgen Ellegaard
Format: Text
Language: English
Published: Public Library of Science 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3101207/
https://www.ncbi.nlm.nih.gov/pubmed/21629687
http://dx.doi.org/10.1371/journal.pone.0019670
_version_ 1782204256496910336
author Penner, Robert Clark
Knudsen, Michael
Wiuf, Carsten
Andersen, Jørgen Ellegaard
author_facet Penner, Robert Clark
Knudsen, Michael
Wiuf, Carsten
Andersen, Jørgen Ellegaard
author_sort Penner, Robert Clark
collection PubMed
description The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular interaction and function. We introduce a novel mathematical abstraction based on geometric topology to describe protein domain structure. Using the locations of the backbone atoms and the hydrogen bonds, we build a combinatorial object – a so-called fatgraph. The description is discrete yet gives rise to a 2-dimensional mathematical surface. Thus, each protein domain corresponds to a particular mathematical surface with characteristic topological invariants, such as the genus (number of holes) and the number of boundary components. Both invariants are global fatgraph features reflecting the interconnectivity of the domain by hydrogen bonds. We introduce the notion of robust variables, that is, variables that are robust towards minor changes in the structure/fatgraph, and show that the genus and the number of boundary components are robust. Further, we investigate the distribution of different fatgraph variables and show how only four variables are capable of distinguishing different folds. We use local (secondary) and global (tertiary) fatgraph features to describe domain structures and illustrate that they are useful for classification of domains in CATH. In addition, we combine our method with two other methods, thereby using primary, secondary, and tertiary structure information, and show that we can identify a large percentage of new and unclassified structures in CATH.
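The abstract names two global invariants of a fatgraph, the genus and the number of boundary components. As a rough illustration of how such invariants can be read off a combinatorial description, the sketch below uses a standard textbook encoding that is not taken from the paper: a connected, orientable fatgraph is given by half-edges ("darts"), a permutation `sigma` listing the cyclic order of darts at each vertex, and an involution `alpha` pairing the two darts of each edge. Boundary components are then the cycles of `sigma` composed with `alpha`, and the genus follows from the Euler characteristic, 2 - 2g = V - E + r. The function names, the dictionary encoding, and the restriction to connected orientable fatgraphs are all assumptions made for this example; it does not reproduce the authors' construction from backbone atoms and hydrogen bonds.

```python
def count_cycles(perm):
    """Count the cycles of a permutation given as a dict dart -> dart."""
    seen = set()
    cycles = 0
    for start in perm:
        if start in seen:
            continue
        cycles += 1
        dart = start
        while dart not in seen:
            seen.add(dart)
            dart = perm[dart]
    return cycles


def fatgraph_invariants(sigma, alpha):
    """Invariants of a connected, orientable fatgraph (assumed encoding).

    sigma: dict mapping each dart to the next dart in the cyclic
           order around its vertex.
    alpha: dict mapping each dart to the other dart of the same edge.
    """
    # Walking along a boundary: cross the edge (alpha), then turn to
    # the next dart at the vertex reached (sigma).
    phi = {dart: sigma[alpha[dart]] for dart in sigma}
    vertices = count_cycles(sigma)
    edges = len(alpha) // 2
    boundaries = count_cycles(phi)
    euler = vertices - edges + boundaries
    genus = (2 - euler) // 2  # valid only for connected orientable fatgraphs
    return {"vertices": vertices, "edges": edges,
            "boundaries": boundaries, "genus": genus}


# One vertex with two loops; darts 0/1 form one loop, 2/3 the other.
alpha = {0: 1, 1: 0, 2: 3, 3: 2}
# Interleaved cyclic order (0, 2, 1, 3) around the vertex.
interleaved = {0: 2, 2: 1, 1: 3, 3: 0}
# Non-interleaved cyclic order (0, 1, 2, 3) around the vertex.
nested = {0: 1, 1: 2, 2: 3, 3: 0}

torus_like = fatgraph_invariants(interleaved, alpha)
planar = fatgraph_invariants(nested, alpha)
```

The two toy inputs share the same underlying graph and differ only in the cyclic order at the vertex, yet they yield different surfaces. This is the sense in which the invariants are global features of the fatgraph and not of the graph alone.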
format Text
id pubmed-3101207
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3101207 2011-05-31 An Algebro-Topological Description of Protein Domain Structure Penner, Robert Clark Knudsen, Michael Wiuf, Carsten Andersen, Jørgen Ellegaard PLoS One Research Article The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular interaction and function. We introduce a novel mathematical abstraction based on geometric topology to describe protein domain structure. Using the locations of the backbone atoms and the hydrogen bonds, we build a combinatorial object – a so-called fatgraph. The description is discrete yet gives rise to a 2-dimensional mathematical surface. Thus, each protein domain corresponds to a particular mathematical surface with characteristic topological invariants, such as the genus (number of holes) and the number of boundary components. Both invariants are global fatgraph features reflecting the interconnectivity of the domain by hydrogen bonds. We introduce the notion of robust variables, that is, variables that are robust towards minor changes in the structure/fatgraph, and show that the genus and the number of boundary components are robust. Further, we investigate the distribution of different fatgraph variables and show how only four variables are capable of distinguishing different folds. We use local (secondary) and global (tertiary) fatgraph features to describe domain structures and illustrate that they are useful for classification of domains in CATH. In addition, we combine our method with two other methods, thereby using primary, secondary, and tertiary structure information, and show that we can identify a large percentage of new and unclassified structures in CATH.
Public Library of Science 2011-05-24 /pmc/articles/PMC3101207/ /pubmed/21629687 http://dx.doi.org/10.1371/journal.pone.0019670 Text en Penner et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Penner, Robert Clark
Knudsen, Michael
Wiuf, Carsten
Andersen, Jørgen Ellegaard
An Algebro-Topological Description of Protein Domain Structure
title An Algebro-Topological Description of Protein Domain Structure
title_full An Algebro-Topological Description of Protein Domain Structure
title_fullStr An Algebro-Topological Description of Protein Domain Structure
title_full_unstemmed An Algebro-Topological Description of Protein Domain Structure
title_short An Algebro-Topological Description of Protein Domain Structure
title_sort algebro-topological description of protein domain structure
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3101207/
https://www.ncbi.nlm.nih.gov/pubmed/21629687
http://dx.doi.org/10.1371/journal.pone.0019670
work_keys_str_mv AT pennerrobertclark analgebrotopologicaldescriptionofproteindomainstructure
AT knudsenmichael analgebrotopologicaldescriptionofproteindomainstructure
AT wiufcarsten analgebrotopologicaldescriptionofproteindomainstructure
AT andersenjørgenellegaard analgebrotopologicaldescriptionofproteindomainstructure
AT pennerrobertclark algebrotopologicaldescriptionofproteindomainstructure
AT knudsenmichael algebrotopologicaldescriptionofproteindomainstructure
AT wiufcarsten algebrotopologicaldescriptionofproteindomainstructure
AT andersenjørgenellegaard algebrotopologicaldescriptionofproteindomainstructure