
Predicting the network of substrate-enzyme-product triads by combining compound similarity and functional domain composition

BACKGROUND: Metabolic pathway is a highly regulated network consisting of many metabolic reactions involving substrates, enzymes, and products, where substrates can be transformed into products with particular catalytic enzymes. Since experimental determination of the network of substrate-enzyme-pro...


Bibliographic Details
Main authors: Chen, Lei; Feng, Kai-Yan; Cai, Yu-Dong; Chou, Kuo-Chen; Li, Hai-Peng
Format: Text
Language: English
Published: BioMed Central 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3098070/
https://www.ncbi.nlm.nih.gov/pubmed/20513238
http://dx.doi.org/10.1186/1471-2105-11-293
author Chen, Lei
Feng, Kai-Yan
Cai, Yu-Dong
Chou, Kuo-Chen
Li, Hai-Peng
collection PubMed
description BACKGROUND: A metabolic pathway is a highly regulated network consisting of many metabolic reactions involving substrates, enzymes, and products, where substrates can be transformed into products by particular catalytic enzymes. Since experimental determination of the network of substrate-enzyme-product triads (whether the substrate can be transformed into the product by a given enzyme) is both time-consuming and expensive, it would be very useful to develop a computational approach for predicting the network of substrate-enzyme-product triads. RESULTS: A mathematical model for predicting the network of substrate-enzyme-product triads was developed. A benchmark dataset was constructed that contains 744,192 substrate-enzyme-product triads, of which 14,592 are networking triads and 729,600 are non-networking triads; i.e., the number of negative triads was about 50 times the number of positive triads. The molecular graph was introduced to calculate the similarity between the substrate compounds and between the product compounds, while the functional domain composition was introduced to calculate the similarity between enzyme molecules. The nearest neighbour algorithm was used as the prediction engine, with a novel metric introduced to measure the "nearness" between triads. To train and test the prediction engine, one tenth of the positive triads and one tenth of the negative triads were randomly picked from the benchmark dataset as the testing samples, while the remaining triads were used to train the prediction model. The overall success rate in predicting the network for the testing samples was 98.71%, with a 95.41% success rate for the 1,460 testing networking triads and 98.77% for the 72,960 testing non-networking triads.
CONCLUSIONS: It is promising to use the molecular graph to calculate the similarity between compounds and the functional domain composition to calculate the similarity between enzymes when studying the substrate-enzyme-product network system. The software is available upon request.
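
The nearest-neighbour idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's actual metric, so everything below is an assumption made for illustration: compounds are represented as hypothetical sets of structural features (a stand-in for the molecular-graph similarity), enzymes as sets of functional-domain identifiers, each component is compared with the Jaccard index, and the three component similarities are multiplied to give the "nearness" between two triads. The names `jaccard`, `triad_similarity`, and `predict` are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical triad representation: (substrate features, enzyme domains, product features).
Triad = Tuple[FrozenSet[str], FrozenSet[str], FrozenSet[str]]


def jaccard(a: FrozenSet[str], b: FrozenSet[str]) -> float:
    """Jaccard index of two sets; two empty sets are treated as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def triad_similarity(t1: Triad, t2: Triad) -> float:
    """Assumed 'nearness': product of substrate, enzyme, and product similarities."""
    return (
        jaccard(t1[0], t2[0])
        * jaccard(t1[1], t2[1])
        * jaccard(t1[2], t2[2])
    )


def predict(query: Triad, training: List[Tuple[Triad, bool]]) -> bool:
    """Return the label (networking or not) of the most similar training triad."""
    if not training:
        raise ValueError("training set must not be empty")
    best_label, best_score = training[0][1], -1.0
    for triad, label in training:
        score = triad_similarity(query, triad)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data, invented purely for illustration.
training_set = [
    ((frozenset({"a", "b"}), frozenset({"PF1"}), frozenset({"c", "d"})), True),
    ((frozenset({"x"}), frozenset({"PF2"}), frozenset({"y"})), False),
]
query_triad = (frozenset({"a", "b", "e"}), frozenset({"PF1"}), frozenset({"c", "d"}))
prediction = predict(query_triad, training_set)
```

Multiplying the component similarities means a triad counts as "near" only when its substrate, enzyme, and product all resemble those of the neighbour; other ways of combining them (a sum or a minimum, for example) would be equally plausible stand-ins.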
format Text
id pubmed-3098070
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-30980702011-05-20 Predicting the network of substrate-enzyme-product triads by combining compound similarity and functional domain composition Chen, Lei Feng, Kai-Yan Cai, Yu-Dong Chou, Kuo-Chen Li, Hai-Peng BMC Bioinformatics Research Article BioMed Central 2010-05-31 /pmc/articles/PMC3098070/ /pubmed/20513238 http://dx.doi.org/10.1186/1471-2105-11-293 Text en Copyright ©2010 Chen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Predicting the network of substrate-enzyme-product triads by combining compound similarity and functional domain composition
topic Research Article