Inductive Document Network Embedding with Topic-Word Attention

Document network embedding aims at learning representations for a structured text corpus, i.e., one in which documents are linked to each other. Recent algorithms extend network embedding approaches by incorporating the text content associated with the nodes in their formulations. In most cases, it is hard to...

Bibliographic Details
Main Authors: Brochier, Robin; Guille, Adrien; Velcin, Julien
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148210/
http://dx.doi.org/10.1007/978-3-030-45439-5_22
collection PubMed
description Document network embedding aims at learning representations for a structured text corpus, i.e., one in which documents are linked to each other. Recent algorithms extend network embedding approaches by incorporating the text content associated with the nodes in their formulations. In most cases, it is hard to interpret the learned representations. Moreover, little importance is given to the generalization to new documents that are not observed within the network. In this paper, we propose an interpretable and inductive document network embedding method. We introduce a novel mechanism, the Topic-Word Attention (TWA), that generates document representations based on the interplay between word and topic representations. We train these word and topic vectors through our general model, Inductive Document Network Embedding (IDNE), by leveraging the connections in the document network. Quantitative evaluations show that our approach achieves state-of-the-art performance on various networks, and we qualitatively show that our model produces meaningful and interpretable representations of the words, topics and documents.
format Online
Article
Text
id pubmed-7148210
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7148210 2020-04-13 Inductive Document Network Embedding with Topic-Word Attention Brochier, Robin; Guille, Adrien; Velcin, Julien Advances in Information Retrieval Article 2020-03-17 /pmc/articles/PMC7148210/ http://dx.doi.org/10.1007/978-3-030-45439-5_22 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.