Detecting Hotspot Information Using Multi-Attribute Based Topic Model

Microblogging as a kind of social network has become more and more important in our daily lives. Enormous amounts of information are produced and shared on a daily basis. Detecting hot topics in the mountains of information can help people get to the essential information more quickly. However, due...

Bibliographic Details
Main authors: Wang, Jing; Li, Li; Tan, Feng; Zhu, Ying; Feng, Weisi
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4619720/
https://www.ncbi.nlm.nih.gov/pubmed/26496635
http://dx.doi.org/10.1371/journal.pone.0140539
author Wang, Jing
Li, Li
Tan, Feng
Zhu, Ying
Feng, Weisi
collection PubMed
description Microblogging, as a kind of social network, has become increasingly important in our daily lives. Enormous amounts of information are produced and shared on a daily basis. Detecting hot topics in the mountains of information can help people get to the essential information more quickly. However, due to short and sparse features, a large number of meaningless tweets, and other characteristics of microblogs, traditional topic detection methods are often ineffective in detecting hot topics. In this paper, we propose a new topic model named multi-attribute latent Dirichlet allocation (MA-LDA), in which the time and hashtag attributes of microblogs are incorporated into the LDA model. By introducing the time attribute, the MA-LDA model can decide whether a word should appear in hot topics or not. Meanwhile, compared with the traditional LDA model, applying the hashtag attribute in the MA-LDA model gives the core words an artificially high ranking in the results, meaning the expressiveness of the outcomes can be improved. Empirical evaluations on real data sets demonstrate that our method is able to detect hot topics more accurately and efficiently than several baselines. Our method provides strong evidence of the importance of the temporal factor in extracting hot topics.
format Online
Article
Text
id pubmed-4619720
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4619720 2015-10-29 Detecting Hotspot Information Using Multi-Attribute Based Topic Model. Wang, Jing; Li, Li; Tan, Feng; Zhu, Ying; Feng, Weisi. PLoS One, Research Article. Public Library of Science, 2015-10-23. /pmc/articles/PMC4619720/ /pubmed/26496635 http://dx.doi.org/10.1371/journal.pone.0140539 Text, en. © 2015 Wang et al. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Detecting Hotspot Information Using Multi-Attribute Based Topic Model
topic Research Article
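
The abstract in this record says that time and hashtag attributes help decide which words belong to hot topics. The sketch below is not the MA-LDA model and is not taken from the paper; it is a plain frequency heuristic, offered only to illustrate the intuition that a word is "hot" when it is frequent in the current time window relative to earlier windows, and that words used as hashtags can be given extra weight. The function name `hot_word_scores`, the `hashtag_boost` parameter, and the integer window indices are all hypothetical.

```python
from collections import Counter


def hot_word_scores(posts, current_window, hashtag_boost=2.0):
    """Score words by how bursty they are in the current time window.

    posts: iterable of (window_index, text) pairs; window_index is an
           assumed integer time bucket (for example, one per day).
    hashtag_boost: assumed extra weight for tokens written as #hashtags.
    Returns a dict mapping each word seen in the current window to a score.
    This is an illustrative heuristic, not the MA-LDA model.
    """
    current = Counter()   # weighted counts in the current window
    past = Counter()      # weighted counts summed over earlier windows
    past_windows = set()  # which earlier windows actually occurred

    for window, text in posts:
        for token in text.lower().split():
            is_hashtag = token.startswith("#")
            word = token.lstrip("#")
            if not word:
                continue
            weight = hashtag_boost if is_hashtag else 1.0
            if window == current_window:
                current[word] += weight
            elif window < current_window:
                past[word] += weight
                past_windows.add(window)

    n_past = max(len(past_windows), 1)
    scores = {}
    for word, count in current.items():
        baseline = past[word] / n_past          # average weight per past window
        scores[word] = count / (baseline + 1.0)  # +1 smooths unseen words
    return scores


posts = [
    (0, "coffee morning"),
    (1, "coffee again"),
    (2, "#earthquake coffee"),
    (2, "earthquake news"),
]
scores = hot_word_scores(posts, current_window=2)
top_word = max(scores, key=scores.get)
```

In this toy input, a word that appears in every window is scored lower than a word that appears only in the current window and is also used as a hashtag. A topic model such as the one described in the abstract would instead learn topic-word distributions jointly with these attributes.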