Learning graph representations of biochemical networks and its application to enzymatic link prediction
MOTIVATION: The complete characterization of enzymatic activities between molecules remains incomplete, hindering biological engineering and limiting biological discovery. We develop in this work a technique, enzymatic link prediction (ELP), for predicting the likelihood of an enzymatic transformati...
Main authors: | Jiang, Julie; Liu, Li-Ping; Hassoun, Soha |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8097755/ https://www.ncbi.nlm.nih.gov/pubmed/33051674 http://dx.doi.org/10.1093/bioinformatics/btaa881 |
_version_ | 1783688376682020864 |
---|---|
author | Jiang, Julie Liu, Li-Ping Hassoun, Soha |
author_facet | Jiang, Julie Liu, Li-Ping Hassoun, Soha |
author_sort | Jiang, Julie |
collection | PubMed |
description | MOTIVATION: The complete characterization of enzymatic activities between molecules remains incomplete, hindering biological engineering and limiting biological discovery. We develop in this work a technique, enzymatic link prediction (ELP), for predicting the likelihood of an enzymatic transformation between two molecules. ELP models enzymatic reactions cataloged in the KEGG database as a graph. ELP is innovative over prior works in using graph embedding to learn molecular representations that capture not only molecular and enzymatic attributes but also graph connectivity. RESULTS: We explore transductive (test nodes included in the training graph) and inductive (test nodes not part of the training graph) learning models. We show that ELP achieves high AUC when learning node embeddings using both graph connectivity and node attributes. Further, we show that graph embedding improves link prediction by 30% in area under curve over fingerprint-based similarity approaches and by 8% over support vector machines. We compare ELP against rule-based methods. We also evaluate ELP for predicting links in pathway maps and for reconstruction of edges in reaction networks of four common gut microbiota phyla: actinobacteria, bacteroidetes, firmicutes and proteobacteria. To emphasize the importance of graph embedding in the context of biochemical networks, we illustrate how graph embedding can guide visualization. AVAILABILITY AND IMPLEMENTATION: The code and datasets are available through https://github.com/HassounLab/ELP. |
format | Online Article Text |
id | pubmed-8097755 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-80977552021-05-10 Learning graph representations of biochemical networks and its application to enzymatic link prediction Jiang, Julie Liu, Li-Ping Hassoun, Soha Bioinformatics Original Papers MOTIVATION: The complete characterization of enzymatic activities between molecules remains incomplete, hindering biological engineering and limiting biological discovery. We develop in this work a technique, enzymatic link prediction (ELP), for predicting the likelihood of an enzymatic transformation between two molecules. ELP models enzymatic reactions cataloged in the KEGG database as a graph. ELP is innovative over prior works in using graph embedding to learn molecular representations that capture not only molecular and enzymatic attributes but also graph connectivity. RESULTS: We explore transductive (test nodes included in the training graph) and inductive (test nodes not part of the training graph) learning models. We show that ELP achieves high AUC when learning node embeddings using both graph connectivity and node attributes. Further, we show that graph embedding improves link prediction by 30% in area under curve over fingerprint-based similarity approaches and by 8% over support vector machines. We compare ELP against rule-based methods. We also evaluate ELP for predicting links in pathway maps and for reconstruction of edges in reaction networks of four common gut microbiota phyla: actinobacteria, bacteroidetes, firmicutes and proteobacteria. To emphasize the importance of graph embedding in the context of biochemical networks, we illustrate how graph embedding can guide visualization. AVAILABILITY AND IMPLEMENTATION: The code and datasets are available through https://github.com/HassounLab/ELP. Oxford University Press 2020-10-14 /pmc/articles/PMC8097755/ /pubmed/33051674 http://dx.doi.org/10.1093/bioinformatics/btaa881 Text en © The Author(s) 2020. Published by Oxford University Press. 
https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Original Papers Jiang, Julie Liu, Li-Ping Hassoun, Soha Learning graph representations of biochemical networks and its application to enzymatic link prediction |
title | Learning graph representations of biochemical networks and its application to enzymatic link prediction |
title_full | Learning graph representations of biochemical networks and its application to enzymatic link prediction |
title_fullStr | Learning graph representations of biochemical networks and its application to enzymatic link prediction |
title_full_unstemmed | Learning graph representations of biochemical networks and its application to enzymatic link prediction |
title_short | Learning graph representations of biochemical networks and its application to enzymatic link prediction |
title_sort | learning graph representations of biochemical networks and its application to enzymatic link prediction |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8097755/ https://www.ncbi.nlm.nih.gov/pubmed/33051674 http://dx.doi.org/10.1093/bioinformatics/btaa881 |
work_keys_str_mv | AT jiangjulie learninggraphrepresentationsofbiochemicalnetworksanditsapplicationtoenzymaticlinkprediction AT liuliping learninggraphrepresentationsofbiochemicalnetworksanditsapplicationtoenzymaticlinkprediction AT hassounsoha learninggraphrepresentationsofbiochemicalnetworksanditsapplicationtoenzymaticlinkprediction |
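
The description field above names fingerprint-based similarity as a baseline for link prediction and reports results as area under the curve (AUC). The following is a minimal sketch of that general idea only, not the authors' ELP method or their evaluation code (those are in the repository linked in the record). It assumes Tanimoto similarity as the similarity measure, which the record does not specify, and the molecule names, fingerprint bits, and link lists are hypothetical toy data.

```python
from itertools import product


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of 'on' fingerprint bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def auc(pos_scores, neg_scores) -> float:
    """Fraction of (positive, negative) score pairs ranked correctly; ties count 0.5."""
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos_scores, neg_scores)
    )
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical toy fingerprints: each molecule is a set of active bit indices.
fingerprints = {
    "A": frozenset({1, 2, 3, 4}),
    "B": frozenset({1, 2, 3, 5}),
    "C": frozenset({7, 8, 9}),
    "D": frozenset({7, 8, 10}),
}

# Hypothetical labels: pairs assumed to be linked by an enzymatic reaction,
# and pairs assumed not to be linked.
known_links = [("A", "B"), ("C", "D")]
non_links = [("A", "C"), ("B", "D")]

# Score every candidate pair by the similarity of its two fingerprints.
pos = [tanimoto(fingerprints[u], fingerprints[v]) for u, v in known_links]
neg = [tanimoto(fingerprints[u], fingerprints[v]) for u, v in non_links]

baseline_auc = auc(pos, neg)
print(f"similarity-baseline AUC on toy data: {baseline_auc:.2f}")
```

A similarity baseline of this kind scores a pair using only the two molecules' own attributes. The abstract's point is that learned graph embeddings can additionally use graph connectivity, which is where the reported AUC improvement comes from.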