Inductive Document Network Embedding with Topic-Word Attention
Document network embedding aims at learning representations for a structured text corpus i.e. when documents are linked to each other. Recent algorithms extend network embedding approaches by incorporating the text content associated with the nodes in their formulations. In most cases, it is hard to interpret the learned representations. Moreover, little importance is given to the generalization to new documents that are not observed within the network. In this paper, we propose an interpretable and inductive document network embedding method. We introduce a novel mechanism, the Topic-Word Attention (TWA), that generates document representations based on the interplay between word and topic representations. We train these word and topic vectors through our general model, Inductive Document Network Embedding (IDNE), by leveraging the connections in the document network. Quantitative evaluations show that our approach achieves state-of-the-art performance on various networks and we qualitatively show that our model produces meaningful and interpretable representations of the words, topics and documents.
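The abstract describes the Topic-Word Attention mechanism only at a high level. As a rough illustration of the idea, the sketch below shows how attention between a document's word vectors and a set of topic vectors could be pooled into a single document representation. All names, the softmax-over-words choice, and the mean pooling are assumptions made for this example; they do not reproduce the authors' exact TWA/IDNE formulation (see the linked paper for that).

```python
# Illustrative sketch only: a plausible topic-word attention layer,
# not the authors' actual IDNE implementation.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def topic_word_attention(word_vecs, topic_vecs):
    """Build a document vector from the interplay between word vectors
    (n_words x d) and topic vectors (n_topics x d). Hypothetical design."""
    # Affinity of each word in the document with each topic.
    scores = word_vecs @ topic_vecs.T          # (n_words, n_topics)
    # Attention over the document's words, one distribution per topic.
    attn = softmax(scores, axis=0)             # each column sums to 1
    # One topic-specific summary of the document per topic.
    per_topic = attn.T @ word_vecs             # (n_topics, d)
    # Collapse the topic-specific summaries into a single document vector.
    return per_topic.mean(axis=0)              # (d,)

# Usage: random vectors stand in for trained word/topic embeddings.
rng = np.random.default_rng(0)
doc_words = rng.normal(size=(12, 64))          # 12 words, 64-dim embeddings
topics = rng.normal(size=(5, 64))              # 5 topic vectors
print(topic_word_attention(doc_words, topics).shape)  # (64,)
```

Under this kind of scheme only the word and topic vectors are learned, which is what would make it inductive in the sense the abstract mentions: a new document never seen in the network can still be embedded from its words alone.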
Main Authors: | Brochier, Robin; Guille, Adrien; Velcin, Julien |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148210/ http://dx.doi.org/10.1007/978-3-030-45439-5_22 |
Field | Value
---|---
_version_ | 1783520544164937728
author | Brochier, Robin Guille, Adrien Velcin, Julien |
author_facet | Brochier, Robin Guille, Adrien Velcin, Julien |
author_sort | Brochier, Robin |
collection | PubMed |
description | Document network embedding aims at learning representations for a structured text corpus i.e. when documents are linked to each other. Recent algorithms extend network embedding approaches by incorporating the text content associated with the nodes in their formulations. In most cases, it is hard to interpret the learned representations. Moreover, little importance is given to the generalization to new documents that are not observed within the network. In this paper, we propose an interpretable and inductive document network embedding method. We introduce a novel mechanism, the Topic-Word Attention (TWA), that generates document representations based on the interplay between word and topic representations. We train these word and topic vectors through our general model, Inductive Document Network Embedding (IDNE), by leveraging the connections in the document network. Quantitative evaluations show that our approach achieves state-of-the-art performance on various networks and we qualitatively show that our model produces meaningful and interpretable representations of the words, topics and documents. |
format | Online Article Text |
id | pubmed-7148210 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-71482102020-04-13 Inductive Document Network Embedding with Topic-Word Attention Brochier, Robin Guille, Adrien Velcin, Julien Advances in Information Retrieval Article Document network embedding aims at learning representations for a structured text corpus i.e. when documents are linked to each other. Recent algorithms extend network embedding approaches by incorporating the text content associated with the nodes in their formulations. In most cases, it is hard to interpret the learned representations. Moreover, little importance is given to the generalization to new documents that are not observed within the network. In this paper, we propose an interpretable and inductive document network embedding method. We introduce a novel mechanism, the Topic-Word Attention (TWA), that generates document representations based on the interplay between word and topic representations. We train these word and topic vectors through our general model, Inductive Document Network Embedding (IDNE), by leveraging the connections in the document network. Quantitative evaluations show that our approach achieves state-of-the-art performance on various networks and we qualitatively show that our model produces meaningful and interpretable representations of the words, topics and documents. 2020-03-17 /pmc/articles/PMC7148210/ http://dx.doi.org/10.1007/978-3-030-45439-5_22 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Brochier, Robin Guille, Adrien Velcin, Julien Inductive Document Network Embedding with Topic-Word Attention |
title | Inductive Document Network Embedding with Topic-Word Attention |
title_full | Inductive Document Network Embedding with Topic-Word Attention |
title_fullStr | Inductive Document Network Embedding with Topic-Word Attention |
title_full_unstemmed | Inductive Document Network Embedding with Topic-Word Attention |
title_short | Inductive Document Network Embedding with Topic-Word Attention |
title_sort | inductive document network embedding with topic-word attention |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148210/ http://dx.doi.org/10.1007/978-3-030-45439-5_22 |
work_keys_str_mv | AT brochierrobin inductivedocumentnetworkembeddingwithtopicwordattention AT guilleadrien inductivedocumentnetworkembeddingwithtopicwordattention AT velcinjulien inductivedocumentnetworkembeddingwithtopicwordattention |