Identifying informative tweets during a pandemic via a topic-aware neural language model
Every epidemic affects the real lives of many people around the world and leads to terrible consequences. Recently, many tweets about the COVID-19 pandemic have been shared publicly on social media platforms. The analysis of these tweets is helpful for emergency response organizations to prioritize...
Main Authors: | Gao, Wang; Li, Lin; Tao, Xiaohui; Zhou, Jing; Tao, Jun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer US 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924578/ https://www.ncbi.nlm.nih.gov/pubmed/35308294 http://dx.doi.org/10.1007/s11280-022-01034-1 |
_version_ | 1784669888708608000 |
---|---|
author | Gao, Wang Li, Lin Tao, Xiaohui Zhou, Jing Tao, Jun |
author_facet | Gao, Wang Li, Lin Tao, Xiaohui Zhou, Jing Tao, Jun |
author_sort | Gao, Wang |
collection | PubMed |
description | Every epidemic affects the real lives of many people around the world and leads to terrible consequences. Recently, many tweets about the COVID-19 pandemic have been shared publicly on social media platforms. The analysis of these tweets is helpful for emergency response organizations to prioritize their tasks and make better decisions. However, most of these tweets are non-informative, which is a challenge for establishing an automated system to detect useful information in social media. Furthermore, existing methods ignore unlabeled data and topic background knowledge, which can provide additional semantic information. In this paper, we propose a novel Topic-Aware BERT (TABERT) model to solve the above challenges. TABERT first leverages a topic model to extract the latent topics of tweets. Secondly, a flexible framework is used to combine topic information with the output of BERT. Finally, we adopt adversarial training to achieve semi-supervised learning, and a large amount of unlabeled data can be used to improve inner representations of the model. Experimental results on the dataset of COVID-19 English tweets show that our model outperforms classic and state-of-the-art baselines. |
format | Online Article Text |
id | pubmed-8924578 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Springer US |
record_format | MEDLINE/PubMed |
spelling | pubmed-89245782022-03-16 Identifying informative tweets during a pandemic via a topic-aware neural language model Gao, Wang Li, Lin Tao, Xiaohui Zhou, Jing Tao, Jun World Wide Web Article Every epidemic affects the real lives of many people around the world and leads to terrible consequences. Recently, many tweets about the COVID-19 pandemic have been shared publicly on social media platforms. The analysis of these tweets is helpful for emergency response organizations to prioritize their tasks and make better decisions. However, most of these tweets are non-informative, which is a challenge for establishing an automated system to detect useful information in social media. Furthermore, existing methods ignore unlabeled data and topic background knowledge, which can provide additional semantic information. In this paper, we propose a novel Topic-Aware BERT (TABERT) model to solve the above challenges. TABERT first leverages a topic model to extract the latent topics of tweets. Secondly, a flexible framework is used to combine topic information with the output of BERT. Finally, we adopt adversarial training to achieve semi-supervised learning, and a large amount of unlabeled data can be used to improve inner representations of the model. Experimental results on the dataset of COVID-19 English tweets show that our model outperforms classic and state-of-the-art baselines. Springer US 2022-03-16 2023 /pmc/articles/PMC8924578/ /pubmed/35308294 http://dx.doi.org/10.1007/s11280-022-01034-1 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Gao, Wang Li, Lin Tao, Xiaohui Zhou, Jing Tao, Jun Identifying informative tweets during a pandemic via a topic-aware neural language model |
title | Identifying informative tweets during a pandemic via a topic-aware neural language model |
title_full | Identifying informative tweets during a pandemic via a topic-aware neural language model |
title_fullStr | Identifying informative tweets during a pandemic via a topic-aware neural language model |
title_full_unstemmed | Identifying informative tweets during a pandemic via a topic-aware neural language model |
title_short | Identifying informative tweets during a pandemic via a topic-aware neural language model |
title_sort | identifying informative tweets during a pandemic via a topic-aware neural language model |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8924578/ https://www.ncbi.nlm.nih.gov/pubmed/35308294 http://dx.doi.org/10.1007/s11280-022-01034-1 |
work_keys_str_mv | AT gaowang identifyinginformativetweetsduringapandemicviaatopicawareneurallanguagemodel AT lilin identifyinginformativetweetsduringapandemicviaatopicawareneurallanguagemodel AT taoxiaohui identifyinginformativetweetsduringapandemicviaatopicawareneurallanguagemodel AT zhoujing identifyinginformativetweetsduringapandemicviaatopicawareneurallanguagemodel AT taojun identifyinginformativetweetsduringapandemicviaatopicawareneurallanguagemodel |