Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families
An important aspect of the functional annotation of enzymes is not only the type of reaction catalysed by an enzyme, but also the substrate specificity, which can vary widely within the same family. In many cases, prediction of family membership and even substrate specificity is possible from enzyme...
Main authors: | Röttig, Marc; Rausch, Christian; Kohlbacher, Oliver |
---|---|
Format: | Text |
Language: | English |
Published: | Public Library of Science, 2010 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796266/ https://www.ncbi.nlm.nih.gov/pubmed/20072606 http://dx.doi.org/10.1371/journal.pcbi.1000636 |
_version_ | 1782175520114343936 |
---|---|
author | Röttig, Marc; Rausch, Christian; Kohlbacher, Oliver |
author_facet | Röttig, Marc; Rausch, Christian; Kohlbacher, Oliver |
author_sort | Röttig, Marc |
collection | PubMed |
description | An important aspect of the functional annotation of enzymes is not only the type of reaction catalysed by an enzyme, but also the substrate specificity, which can vary widely within the same family. In many cases, prediction of family membership and even substrate specificity is possible from enzyme sequence alone, using a nearest neighbour classification rule. However, the combination of structural information and sequence information can improve the interpretability and accuracy of predictive models. The method presented here, Active Site Classification (ASC), automatically extracts the residues lining the active site from one representative three-dimensional structure and the corresponding residues from sequences of other members of the family. From a set of representatives with known substrate specificity, a Support Vector Machine (SVM) can then learn a model of substrate specificity. Applied to a sequence of unknown specificity, the SVM can then predict the most likely substrate. The models can also be analysed to reveal the underlying structural reasons determining substrate specificities and thus yield valuable insights into mechanisms of enzyme specificity. We illustrate the high prediction accuracy achieved on two benchmark data sets and the structural insights gained from ASC by a detailed analysis of the family of decarboxylating dehydrogenases. The ASC web service is available at http://asc.informatik.uni-tuebingen.de/. |
format | Text |
id | pubmed-2796266 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-2796266 2010-01-14 Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families Röttig, Marc; Rausch, Christian; Kohlbacher, Oliver PLoS Comput Biol Research Article Public Library of Science 2010-01-08 /pmc/articles/PMC2796266/ /pubmed/20072606 http://dx.doi.org/10.1371/journal.pcbi.1000636 Text en Röttig et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Röttig, Marc Rausch, Christian Kohlbacher, Oliver Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families |
title | Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families |
title_sort | combining structure and sequence information allows automated prediction of substrate specificities within enzyme families |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796266/ https://www.ncbi.nlm.nih.gov/pubmed/20072606 http://dx.doi.org/10.1371/journal.pcbi.1000636 |
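
The description in this record outlines a pipeline: take the residues lining the active site, read off the corresponding residues in other family members, and classify a new sequence by comparison with members of known substrate specificity. As a rough illustration only, the Python sketch below reads hypothetical active-site columns out of pre-aligned toy sequences and applies the nearest neighbour rule mentioned in the description, not the SVM that ASC itself uses. The sequences, column indices, substrate labels, and function names are all invented for this example and are not taken from the article or from the ASC web service.

```python
# Toy sketch: active-site signature extraction plus nearest neighbour
# classification. All data and names here are hypothetical.

def extract_signature(aligned_seq, positions):
    """Return the residues found at the given 0-based alignment columns."""
    return "".join(aligned_seq[p] for p in positions)


def hamming(a, b):
    """Count the positions at which two equal-length signatures differ."""
    if len(a) != len(b):
        raise ValueError("signatures must have equal length")
    return sum(x != y for x, y in zip(a, b))


def predict_substrate(query_signature, training):
    """Return the label of the training signature closest to the query.

    `training` is a list of (signature, substrate_label) pairs.
    Ties are resolved in favour of the earliest entry in the list.
    """
    _, label = min(training, key=lambda item: hamming(query_signature, item[0]))
    return label


# Hypothetical alignment columns assumed to line the active site.
ACTIVE_SITE_COLUMNS = [2, 5, 7]

# Hypothetical pre-aligned family members with known substrates.
LABELLED_ALIGNMENT = [
    ("MKRLDVAGS", "substrate_A"),
    ("MKRIDVAGT", "substrate_A"),
    ("MKSLDEAHS", "substrate_B"),
    ("MRSLDEAHT", "substrate_B"),
]

TRAINING = [
    (extract_signature(seq, ACTIVE_SITE_COLUMNS), label)
    for seq, label in LABELLED_ALIGNMENT
]

if __name__ == "__main__":
    query = "MKRLDEAGS"  # hypothetical sequence of unknown specificity
    signature = extract_signature(query, ACTIVE_SITE_COLUMNS)
    print(signature, predict_substrate(signature, TRAINING))
```

Restricting the comparison to the active-site columns is the point of the sketch: two sequences may differ at many positions overall while sharing the same signature, so the prediction depends only on the residues assumed to contact the substrate.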