
Identifying informative tweets during a pandemic via a topic-aware neural language model

Every epidemic affects the real lives of many people around the world and leads to terrible consequences. Recently, many tweets about the COVID-19 pandemic have been shared publicly on social media platforms. The analysis of these tweets is helpful for emergency response organizations to prioritize...


Bibliographic Details
Main authors: Gao, Wang; Li, Lin; Tao, Xiaohui; Zhou, Jing; Tao, Jun
Format: Online Article Text
Language: English
Published: Springer US 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924578/
https://www.ncbi.nlm.nih.gov/pubmed/35308294
http://dx.doi.org/10.1007/s11280-022-01034-1
author Gao, Wang
Li, Lin
Tao, Xiaohui
Zhou, Jing
Tao, Jun
author_facet Gao, Wang
Li, Lin
Tao, Xiaohui
Zhou, Jing
Tao, Jun
author_sort Gao, Wang
collection PubMed
description Every epidemic affects the real lives of many people around the world and leads to terrible consequences. Recently, many tweets about the COVID-19 pandemic have been shared publicly on social media platforms. The analysis of these tweets is helpful for emergency response organizations to prioritize their tasks and make better decisions. However, most of these tweets are non-informative, which is a challenge for establishing an automated system to detect useful information in social media. Furthermore, existing methods ignore unlabeled data and topic background knowledge, which can provide additional semantic information. In this paper, we propose a novel Topic-Aware BERT (TABERT) model to solve the above challenges. TABERT first leverages a topic model to extract the latent topics of tweets. Secondly, a flexible framework is used to combine topic information with the output of BERT. Finally, we adopt adversarial training to achieve semi-supervised learning, and a large amount of unlabeled data can be used to improve inner representations of the model. Experimental results on the dataset of COVID-19 English tweets show that our model outperforms classic and state-of-the-art baselines.
format Online
Article
Text
id pubmed-8924578
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-89245782022-03-16 Identifying informative tweets during a pandemic via a topic-aware neural language model Gao, Wang Li, Lin Tao, Xiaohui Zhou, Jing Tao, Jun World Wide Web Article Every epidemic affects the real lives of many people around the world and leads to terrible consequences. Recently, many tweets about the COVID-19 pandemic have been shared publicly on social media platforms. The analysis of these tweets is helpful for emergency response organizations to prioritize their tasks and make better decisions. However, most of these tweets are non-informative, which is a challenge for establishing an automated system to detect useful information in social media. Furthermore, existing methods ignore unlabeled data and topic background knowledge, which can provide additional semantic information. In this paper, we propose a novel Topic-Aware BERT (TABERT) model to solve the above challenges. TABERT first leverages a topic model to extract the latent topics of tweets. Secondly, a flexible framework is used to combine topic information with the output of BERT. Finally, we adopt adversarial training to achieve semi-supervised learning, and a large amount of unlabeled data can be used to improve inner representations of the model. Experimental results on the dataset of COVID-19 English tweets show that our model outperforms classic and state-of-the-art baselines. Springer US 2022-03-16 2023 /pmc/articles/PMC8924578/ /pubmed/35308294 http://dx.doi.org/10.1007/s11280-022-01034-1 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Gao, Wang
Li, Lin
Tao, Xiaohui
Zhou, Jing
Tao, Jun
Identifying informative tweets during a pandemic via a topic-aware neural language model
title Identifying informative tweets during a pandemic via a topic-aware neural language model
title_full Identifying informative tweets during a pandemic via a topic-aware neural language model
title_fullStr Identifying informative tweets during a pandemic via a topic-aware neural language model
title_full_unstemmed Identifying informative tweets during a pandemic via a topic-aware neural language model
title_short Identifying informative tweets during a pandemic via a topic-aware neural language model
title_sort identifying informative tweets during a pandemic via a topic-aware neural language model
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924578/
https://www.ncbi.nlm.nih.gov/pubmed/35308294
http://dx.doi.org/10.1007/s11280-022-01034-1