Accurately modeling biased random walks on weighted networks using node2vec+

Bibliographic Details
Main Authors: Liu, Renming, Hirn, Matthew, Krishnan, Arjun
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9891245/
https://www.ncbi.nlm.nih.gov/pubmed/36688699
http://dx.doi.org/10.1093/bioinformatics/btad047
author Liu, Renming
Hirn, Matthew
Krishnan, Arjun
collection PubMed
description MOTIVATION: Accurately representing biological networks in a low-dimensional space, also known as network embedding, is a critical step in network-based machine learning and is carried out widely using node2vec, an unsupervised method based on biased random walks. However, while many networks, including functional gene interaction networks, are dense, weighted graphs, node2vec is fundamentally limited in its ability to use edge weights during the biased random walk generation process, thus under-using all the information in the network. RESULTS: Here, we present node2vec+, a natural extension of node2vec that accounts for edge weights when calculating walk biases and reduces to node2vec in the cases of unweighted graphs or unbiased walks. Using two synthetic datasets, we empirically show that node2vec+ is more robust to additive noise than node2vec in weighted graphs. Then, using genome-scale functional gene networks to solve a wide range of gene function and disease prediction tasks, we demonstrate the superior performance of node2vec+ over node2vec in the case of weighted graphs. Notably, due to the limited amount of training data in the gene classification tasks, graph neural networks such as GCN and GraphSAGE are outperformed by both node2vec and node2vec+. AVAILABILITY AND IMPLEMENTATION: The data and code are available on GitHub at https://github.com/krishnanlab/node2vecplus_benchmarks. All additional data underlying this article are available on Zenodo at https://doi.org/10.5281/zenodo.7007164. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-9891245
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9891245 2023-02-02 Bioinformatics Original Paper Oxford University Press 2023-01-23 Text en © The Author(s) 2023. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Accurately modeling biased random walks on weighted networks using node2vec+
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9891245/
https://www.ncbi.nlm.nih.gov/pubmed/36688699
http://dx.doi.org/10.1093/bioinformatics/btad047
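
Illustrative note on the abstract: the limitation it describes can be seen in a minimal sketch of the standard node2vec second-order walk bias, written here under stated assumptions. This is not the authors' code and it does not implement node2vec+. The function name `node2vec_step_probs`, the helper `make_graph`, the dict-of-dicts graph layout, and the toy weights are all hypothetical choices made for illustration. The sketch shows that in plain node2vec the bias toward a candidate node depends only on whether an edge to the previous node exists, not on how strong that edge is.

```python
def node2vec_step_probs(graph, prev, cur, p=1.0, q=1.0):
    """Standard node2vec transition probabilities for one step out of `cur`,
    given that the walk arrived from `prev`.

    `graph` is a hypothetical dict-of-dicts: graph[u][v] = weight of edge (u, v).
    """
    scores = {}
    for nxt, weight in graph[cur].items():
        if nxt == prev:
            bias = 1.0 / p      # stepping back to the previous node
        elif nxt in graph[prev]:
            bias = 1.0          # nxt is adjacent to prev; the weight of (prev, nxt) is ignored
        else:
            bias = 1.0 / q      # stepping away from prev
        scores[nxt] = bias * weight
    total = sum(scores.values())
    return {node: score / total for node, score in scores.items()}


def make_graph(w_ac):
    """Toy undirected weighted graph (illustrative only); w_ac is the a-c weight."""
    return {
        "a": {"b": 1.0, "c": w_ac},
        "b": {"a": 1.0, "c": 1.0, "d": 1.0},
        "c": {"a": w_ac, "b": 1.0},
        "d": {"b": 1.0},
    }


# The walk came from "a" and now sits at "b".
weak = node2vec_step_probs(make_graph(0.01), "a", "b", p=1.0, q=0.5)
strong = node2vec_step_probs(make_graph(0.9), "a", "b", p=1.0, q=0.5)
print(weak)
print(strong)
```

In this toy graph the two printed distributions coincide even though the a-c edge weight differs by almost two orders of magnitude. According to the abstract, node2vec+ changes the bias calculation so that such edge weights are taken into account, while reducing to node2vec on unweighted graphs or unbiased walks.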