
ComStreamClust: a Communicative Multi-Agent Approach to Text Clustering in Streaming Data

Topic detection is the task of determining and tracking hot topics in social media. Twitter is arguably the most popular platform for people to share their ideas with others about different issues. One such prevalent issue is the COVID-19 pandemic. Detecting and tracking topics on these kinds of iss...


Bibliographic Details
Main authors: Najafi, Ali; Gholipour-Shilabin, Araz; Dehkharghani, Rahim; Mohammadpur-Fard, Ali; Asgari-Chenaghlu, Meysam
Format: Online Article Text
Language: English
Published: Springer Berlin Heidelberg 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9263071/
http://dx.doi.org/10.1007/s40745-022-00426-4
_version_ 1784742644080967680
author Najafi, Ali
Gholipour-Shilabin, Araz
Dehkharghani, Rahim
Mohammadpur-Fard, Ali
Asgari-Chenaghlu, Meysam
author_sort Najafi, Ali
collection PubMed
description Topic detection is the task of determining and tracking hot topics in social media. Twitter is arguably the most popular platform for people to share their ideas with others about different issues. One such prevalent issue is the COVID-19 pandemic. Detecting and tracking topics on these kinds of issues would help governments and healthcare companies deal with this phenomenon. In this paper, we propose a novel multi-agent, communicative clustering approach called ComStreamClust for clustering sub-topics inside a broader topic, e.g., COVID-19 and the FA Cup. The proposed approach is parallelizable and can handle several data points simultaneously. The LaBSE sentence embedding is used to measure the semantic similarity between two tweets. ComStreamClust has been evaluated by several metrics such as keyword precision, keyword recall, and topic recall. Based on topic recall across different numbers of keywords, ComStreamClust obtains superior results compared to existing methods.
format Online
Article
Text
id pubmed-9263071
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer Berlin Heidelberg
record_format MEDLINE/PubMed
spelling pubmed-9263071 2022-07-08 ComStreamClust: a Communicative Multi-Agent Approach to Text Clustering in Streaming Data Najafi, Ali Gholipour-Shilabin, Araz Dehkharghani, Rahim Mohammadpur-Fard, Ali Asgari-Chenaghlu, Meysam Ann. Data. Sci. Article Topic detection is the task of determining and tracking hot topics in social media. Twitter is arguably the most popular platform for people to share their ideas with others about different issues. One such prevalent issue is the COVID-19 pandemic. Detecting and tracking topics on these kinds of issues would help governments and healthcare companies deal with this phenomenon. In this paper, we propose a novel multi-agent, communicative clustering approach called ComStreamClust for clustering sub-topics inside a broader topic, e.g., COVID-19 and the FA Cup. The proposed approach is parallelizable and can handle several data points simultaneously. The LaBSE sentence embedding is used to measure the semantic similarity between two tweets. ComStreamClust has been evaluated by several metrics such as keyword precision, keyword recall, and topic recall. Based on topic recall across different numbers of keywords, ComStreamClust obtains superior results compared to existing methods. Springer Berlin Heidelberg 2022-07-08 /pmc/articles/PMC9263071/ http://dx.doi.org/10.1007/s40745-022-00426-4 Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title ComStreamClust: a Communicative Multi-Agent Approach to Text Clustering in Streaming Data
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9263071/
http://dx.doi.org/10.1007/s40745-022-00426-4