Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families

An important aspect of the functional annotation of enzymes is not only the type of reaction catalysed by an enzyme, but also the substrate specificity, which can vary widely within the same family. In many cases, prediction of family membership and even substrate specificity is possible from enzyme...

Bibliographic Details
Main authors: Röttig, Marc, Rausch, Christian, Kohlbacher, Oliver
Format: Text
Language: English
Published: Public Library of Science 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796266/
https://www.ncbi.nlm.nih.gov/pubmed/20072606
http://dx.doi.org/10.1371/journal.pcbi.1000636
_version_ 1782175520114343936
author Röttig, Marc
Rausch, Christian
Kohlbacher, Oliver
author_facet Röttig, Marc
Rausch, Christian
Kohlbacher, Oliver
author_sort Röttig, Marc
collection PubMed
description An important aspect of the functional annotation of enzymes is not only the type of reaction catalysed by an enzyme, but also the substrate specificity, which can vary widely within the same family. In many cases, prediction of family membership and even substrate specificity is possible from enzyme sequence alone, using a nearest neighbour classification rule. However, the combination of structural information and sequence information can improve the interpretability and accuracy of predictive models. The method presented here, Active Site Classification (ASC), automatically extracts the residues lining the active site from one representative three-dimensional structure and the corresponding residues from sequences of other members of the family. From a set of representatives with known substrate specificity, a Support Vector Machine (SVM) can then learn a model of substrate specificity. Applied to a sequence of unknown specificity, the SVM can then predict the most likely substrate. The models can also be analysed to reveal the underlying structural reasons determining substrate specificities and thus yield valuable insights into mechanisms of enzyme specificity. We illustrate the high prediction accuracy achieved on two benchmark data sets and the structural insights gained from ASC by a detailed analysis of the family of decarboxylating dehydrogenases. The ASC web service is available at http://asc.informatik.uni-tuebingen.de/.
format Text
id pubmed-2796266
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-27962662010-01-14 Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families Röttig, Marc Rausch, Christian Kohlbacher, Oliver PLoS Comput Biol Research Article An important aspect of the functional annotation of enzymes is not only the type of reaction catalysed by an enzyme, but also the substrate specificity, which can vary widely within the same family. In many cases, prediction of family membership and even substrate specificity is possible from enzyme sequence alone, using a nearest neighbour classification rule. However, the combination of structural information and sequence information can improve the interpretability and accuracy of predictive models. The method presented here, Active Site Classification (ASC), automatically extracts the residues lining the active site from one representative three-dimensional structure and the corresponding residues from sequences of other members of the family. From a set of representatives with known substrate specificity, a Support Vector Machine (SVM) can then learn a model of substrate specificity. Applied to a sequence of unknown specificity, the SVM can then predict the most likely substrate. The models can also be analysed to reveal the underlying structural reasons determining substrate specificities and thus yield valuable insights into mechanisms of enzyme specificity. We illustrate the high prediction accuracy achieved on two benchmark data sets and the structural insights gained from ASC by a detailed analysis of the family of decarboxylating dehydrogenases. The ASC web service is available at http://asc.informatik.uni-tuebingen.de/. Public Library of Science 2010-01-08 /pmc/articles/PMC2796266/ /pubmed/20072606 http://dx.doi.org/10.1371/journal.pcbi.1000636 Text en Röttig et al. 
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Röttig, Marc
Rausch, Christian
Kohlbacher, Oliver
Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families
title Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families
title_full Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families
title_fullStr Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families
title_full_unstemmed Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families
title_short Combining Structure and Sequence Information Allows Automated Prediction of Substrate Specificities within Enzyme Families
title_sort combining structure and sequence information allows automated prediction of substrate specificities within enzyme families
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2796266/
https://www.ncbi.nlm.nih.gov/pubmed/20072606
http://dx.doi.org/10.1371/journal.pcbi.1000636
work_keys_str_mv AT rottigmarc combiningstructureandsequenceinformationallowsautomatedpredictionofsubstratespecificitieswithinenzymefamilies
AT rauschchristian combiningstructureandsequenceinformationallowsautomatedpredictionofsubstratespecificitieswithinenzymefamilies
AT kohlbacheroliver combiningstructureandsequenceinformationallowsautomatedpredictionofsubstratespecificitieswithinenzymefamilies
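
The description field above outlines two steps: taking the residues that line the active site from each family member's sequence, then classifying a sequence of unknown specificity against members with known substrates. The sketch below illustrates that idea only and is not the ASC implementation. It assumes the sequences are already aligned, so that the same column index refers to the same structural position, and it uses the simpler nearest neighbour rule that the description mentions as the sequence-only baseline, applied to the extracted residues, in place of the Support Vector Machine. All function names, column indices, sequences, and substrate labels are hypothetical.

```python
def active_site_signature(aligned_seq, positions):
    """Return the residues found at the given alignment columns (0-based)."""
    return "".join(aligned_seq[i] for i in positions)


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def predict_substrate(query_seq, training, positions):
    """Predict a substrate label with a 1-nearest-neighbour rule.

    `training` is a list of (aligned_sequence, substrate_label) pairs.
    Distances are computed on active-site signatures only, not on the
    full sequences. Ties go to the earliest training entry.
    """
    query_sig = active_site_signature(query_seq, positions)
    nearest = min(
        training,
        key=lambda item: hamming(query_sig, active_site_signature(item[0], positions)),
    )
    return nearest[1]


# Hypothetical toy data: active-site columns and labelled family members.
POSITIONS = [2, 5, 7]
TRAINING = [
    ("MKRLDSVA", "substrate_A"),
    ("MKRLESVA", "substrate_A"),
    ("MKWLDTVG", "substrate_B"),
    ("MRWLETVG", "substrate_B"),
]

if __name__ == "__main__":
    print(predict_substrate("MKWADTLG", TRAINING, POSITIONS))
```

Restricting the comparison to the active-site columns means that differences elsewhere in the sequence (columns 3 and 6 of the toy query) do not affect the prediction. A trained classifier such as an SVM could take the place of the nearest neighbour rule once the signatures are encoded as numeric feature vectors.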