Principled approach to the selection of the embedding dimension of networks
Network embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension – small enough to be efficient and large enough to be effective – is challenging but necessary to generate embeddings...
Main authors: | Gu, Weiwei; Tandon, Aditya; Ahn, Yong-Yeol; Radicchi, Filippo |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8213704/ https://www.ncbi.nlm.nih.gov/pubmed/34145234 http://dx.doi.org/10.1038/s41467-021-23795-5 |
_version_ | 1783709907851149312 |
---|---|
author | Gu, Weiwei Tandon, Aditya Ahn, Yong-Yeol Radicchi, Filippo |
author_facet | Gu, Weiwei Tandon, Aditya Ahn, Yong-Yeol Radicchi, Filippo |
author_sort | Gu, Weiwei |
collection | PubMed |
description | Network embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension – small enough to be efficient and large enough to be effective – is challenging but necessary to generate embeddings applicable to a multitude of tasks. Existing strategies for the selection of the embedding dimension rely on performance maximization in downstream tasks. Here, we propose a principled method such that all structural information of a network is parsimoniously encoded. The method is validated on various embedding algorithms and a large corpus of real-world networks. The embedding dimension selected by our method in real-world networks suggests that efficient encoding in low-dimensional spaces is usually possible. |
format | Online Article Text |
id | pubmed-8213704 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-82137042021-07-01 Principled approach to the selection of the embedding dimension of networks Gu, Weiwei Tandon, Aditya Ahn, Yong-Yeol Radicchi, Filippo Nat Commun Article Network embedding is a general-purpose machine learning technique that encodes network structure in vector spaces with tunable dimension. Choosing an appropriate embedding dimension – small enough to be efficient and large enough to be effective – is challenging but necessary to generate embeddings applicable to a multitude of tasks. Existing strategies for the selection of the embedding dimension rely on performance maximization in downstream tasks. Here, we propose a principled method such that all structural information of a network is parsimoniously encoded. The method is validated on various embedding algorithms and a large corpus of real-world networks. The embedding dimension selected by our method in real-world networks suggest that efficient encoding in low-dimensional spaces is usually possible. Nature Publishing Group UK 2021-06-18 /pmc/articles/PMC8213704/ /pubmed/34145234 http://dx.doi.org/10.1038/s41467-021-23795-5 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. 
To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ (https://creativecommons.org/licenses/by/4.0/) . |
spellingShingle | Article Gu, Weiwei Tandon, Aditya Ahn, Yong-Yeol Radicchi, Filippo Principled approach to the selection of the embedding dimension of networks |
title | Principled approach to the selection of the embedding dimension of networks |
title_full | Principled approach to the selection of the embedding dimension of networks |
title_fullStr | Principled approach to the selection of the embedding dimension of networks |
title_full_unstemmed | Principled approach to the selection of the embedding dimension of networks |
title_short | Principled approach to the selection of the embedding dimension of networks |
title_sort | principled approach to the selection of the embedding dimension of networks |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8213704/ https://www.ncbi.nlm.nih.gov/pubmed/34145234 http://dx.doi.org/10.1038/s41467-021-23795-5 |
work_keys_str_mv | AT guweiwei principledapproachtotheselectionoftheembeddingdimensionofnetworks AT tandonaditya principledapproachtotheselectionoftheembeddingdimensionofnetworks AT ahnyongyeol principledapproachtotheselectionoftheembeddingdimensionofnetworks AT radicchifilippo principledapproachtotheselectionoftheembeddingdimensionofnetworks |