Seed-Guided Deep Document Clustering
Different users may be interested in different clustering views underlying a given collection (e.g., topic and writing style in documents). Enabling them to provide constraints reflecting their needs can then help obtain tailored clustering results. For document clustering, constraints can be provid...
Main authors: | Fard, Mazar Moradi; Thonet, Thibaut; Gaussier, Eric |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148236/ http://dx.doi.org/10.1007/978-3-030-45439-5_1 |
_version_ | 1783520550286524416 |
---|---|
author | Fard, Mazar Moradi; Thonet, Thibaut; Gaussier, Eric |
author_facet | Fard, Mazar Moradi; Thonet, Thibaut; Gaussier, Eric |
author_sort | Fard, Mazar Moradi |
collection | PubMed |
description | Different users may be interested in different clustering views underlying a given collection (e.g., topic and writing style in documents). Enabling them to provide constraints reflecting their needs can then help obtain tailored clustering results. For document clustering, constraints can be provided in the form of seed words, each cluster being characterized by a small set of words. This seed-guided constrained document clustering problem was recently addressed through topic modeling approaches. In this paper, we jointly learn deep representations and bias the clustering results through the seed words, leading to a Seed-guided Deep Document Clustering approach. Its effectiveness is demonstrated on five public datasets. |
format | Online Article Text |
id | pubmed-7148236 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-71482362020-04-13 Seed-Guided Deep Document Clustering Fard, Mazar Moradi Thonet, Thibaut Gaussier, Eric Advances in Information Retrieval Article Different users may be interested in different clustering views underlying a given collection (e.g., topic and writing style in documents). Enabling them to provide constraints reflecting their needs can then help obtain tailored clustering results. For document clustering, constraints can be provided in the form of seed words, each cluster being characterized by a small set of words. This seed-guided constrained document clustering problem was recently addressed through topic modeling approaches. In this paper, we jointly learn deep representations and bias the clustering results through the seed words, leading to a Seed-guided Deep Document Clustering approach. Its effectiveness is demonstrated on five public datasets. 2020-03-17 /pmc/articles/PMC7148236/ http://dx.doi.org/10.1007/978-3-030-45439-5_1 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Fard, Mazar Moradi Thonet, Thibaut Gaussier, Eric Seed-Guided Deep Document Clustering |
title | Seed-Guided Deep Document Clustering |
title_full | Seed-Guided Deep Document Clustering |
title_fullStr | Seed-Guided Deep Document Clustering |
title_full_unstemmed | Seed-Guided Deep Document Clustering |
title_short | Seed-Guided Deep Document Clustering |
title_sort | seed-guided deep document clustering |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148236/ http://dx.doi.org/10.1007/978-3-030-45439-5_1 |
work_keys_str_mv | AT fardmazarmoradi seedguideddeepdocumentclustering AT thonetthibaut seedguideddeepdocumentclustering AT gaussiereric seedguideddeepdocumentclustering |