An Algebro-Topological Description of Protein Domain Structure
The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular i...
Main Authors: | Penner, Robert Clark; Knudsen, Michael; Wiuf, Carsten; Andersen, Jørgen Ellegaard |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science 2011 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3101207/ https://www.ncbi.nlm.nih.gov/pubmed/21629687 http://dx.doi.org/10.1371/journal.pone.0019670 |
_version_ | 1782204256496910336 |
---|---|
author | Penner, Robert Clark; Knudsen, Michael; Wiuf, Carsten; Andersen, Jørgen Ellegaard |
collection | PubMed |
description | The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular interaction and function. We introduce a novel mathematical abstraction based on geometric topology to describe protein domain structure. Using the locations of the backbone atoms and the hydrogen bonds, we build a combinatorial object – a so-called fatgraph. The description is discrete yet gives rise to a 2-dimensional mathematical surface. Thus, each protein domain corresponds to a particular mathematical surface with characteristic topological invariants, such as the genus (number of holes) and the number of boundary components. Both invariants are global fatgraph features reflecting the interconnectivity of the domain by hydrogen bonds. We introduce the notion of robust variables, that is, variables that are robust towards minor changes in the structure/fatgraph, and show that the genus and the number of boundary components are robust. Further, we investigate the distribution of different fatgraph variables and show how only four variables are capable of distinguishing different folds. We use local (secondary) and global (tertiary) fatgraph features to describe domain structures and illustrate that they are useful for classification of domains in CATH. In addition, we combine our method with two other methods thereby using primary, secondary, and tertiary structure information, and show that we can identify a large percentage of new and unclassified structures in CATH. |
format | Text |
id | pubmed-3101207 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2011 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-3101207 2011-05-31 An Algebro-Topological Description of Protein Domain Structure Penner, Robert Clark Knudsen, Michael Wiuf, Carsten Andersen, Jørgen Ellegaard PLoS One Research Article The space of possible protein structures appears vast and continuous, and the relationship between primary, secondary and tertiary structure levels is complex. Protein structure comparison and classification is therefore a difficult but important task since structure is a determinant for molecular interaction and function. We introduce a novel mathematical abstraction based on geometric topology to describe protein domain structure. Using the locations of the backbone atoms and the hydrogen bonds, we build a combinatorial object – a so-called fatgraph. The description is discrete yet gives rise to a 2-dimensional mathematical surface. Thus, each protein domain corresponds to a particular mathematical surface with characteristic topological invariants, such as the genus (number of holes) and the number of boundary components. Both invariants are global fatgraph features reflecting the interconnectivity of the domain by hydrogen bonds. We introduce the notion of robust variables, that is, variables that are robust towards minor changes in the structure/fatgraph, and show that the genus and the number of boundary components are robust. Further, we investigate the distribution of different fatgraph variables and show how only four variables are capable of distinguishing different folds. We use local (secondary) and global (tertiary) fatgraph features to describe domain structures and illustrate that they are useful for classification of domains in CATH. In addition, we combine our method with two other methods thereby using primary, secondary, and tertiary structure information, and show that we can identify a large percentage of new and unclassified structures in CATH. 
Public Library of Science 2011-05-24 /pmc/articles/PMC3101207/ /pubmed/21629687 http://dx.doi.org/10.1371/journal.pone.0019670 Text en Penner et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
title | An Algebro-Topological Description of Protein Domain Structure |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3101207/ https://www.ncbi.nlm.nih.gov/pubmed/21629687 http://dx.doi.org/10.1371/journal.pone.0019670 |
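
The description above says that a fatgraph gives rise to a surface whose genus and number of boundary components serve as topological invariants. As a rough illustration of how such invariants can be read off a fatgraph, here is a minimal Python sketch. It is not the authors' construction from backbone atoms and hydrogen bonds: it assumes a connected, orientable fatgraph (no twisted edges) encoded in the standard way by two permutations on half-edges, and the names `sigma`, `alpha`, and `fatgraph_invariants` are hypothetical.

```python
def cycles(perm):
    """Return the cycles of a permutation given as a dict {dart: dart}."""
    seen, out = set(), []
    for start in perm:
        if start in seen:
            continue
        cyc, d = [], start
        while d not in seen:
            seen.add(d)
            cyc.append(d)
            d = perm[d]
        out.append(cyc)
    return out


def fatgraph_invariants(sigma, alpha):
    """Compute (genus, boundary_components) of a fatgraph.

    Assumptions (not taken from the paper): the fatgraph is connected
    and orientable, every half-edge (dart) is a key of both dicts,
    sigma maps a dart to the next dart in the cyclic order at its
    vertex, and alpha maps a dart to the other half of the same edge.
    """
    vertices = len(cycles(sigma))
    edges = len(cycles(alpha))
    # Boundary components are the cycles of "cross the edge, then turn".
    phi = {d: sigma[alpha[d]] for d in sigma}
    boundaries = len(cycles(phi))
    # Capping each boundary with a disc gives a closed surface with
    # Euler characteristic V - E + r = 2 - 2g.
    euler = vertices - edges + boundaries
    genus = (2 - euler) // 2
    return genus, boundaries


# Toy example: two trivalent vertices joined by three edges.
# Darts 0, 2, 4 sit at one vertex and 1, 3, 5 at the other.
ALPHA = {0: 1, 1: 0, 2: 3, 3: 2, 4: 5, 5: 4}
PLANAR = {0: 2, 2: 4, 4: 0, 1: 5, 5: 3, 3: 1}   # opposite cyclic orders
TWISTED = {0: 2, 2: 4, 4: 0, 1: 3, 3: 5, 5: 1}  # same cyclic order at both

print(fatgraph_invariants(PLANAR, ALPHA))   # (0, 3)
print(fatgraph_invariants(TWISTED, ALPHA))  # (1, 1)
```

The two toy fatgraphs share the same underlying graph and differ only in the cyclic order at one vertex, yet they yield different surfaces: a sphere with three boundary components and a torus with one. This is why the invariants depend on the fatgraph structure and not on the graph alone.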