Word embeddings and deep learning for location prediction: tracking Coronavirus from British and American tweets

Bibliographic Details
Main Authors: Hasni, Sarra; Faiz, Sami
Format: Online Article Text
Language: English
Published: Springer Vienna 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8315503/
https://www.ncbi.nlm.nih.gov/pubmed/34335992
http://dx.doi.org/10.1007/s13278-021-00777-5
_version_ 1783729731433136128
author Hasni, Sarra
Faiz, Sami
collection PubMed
description As the Coronavirus pandemic spreads, determining its individual and societal impacts becomes increasingly important. Recent research pays special attention to the Coronavirus infodemic on social networks to study such impacts. To this end, we argue that applying a geolocation process is crucial before proceeding to infodemic management. In fact, the spread of reported events and news on social networks makes it more challenging to identify infected areas or the locations of information owners, especially at the state level. In this paper, we focus on linguistic features to encode regional variations from short and noisy texts such as tweets in order to track this disease. We pay particular attention to contextual information for a better encoding of these features. We rely on neural network-based models to capture relationships between words according to their contexts. As examples of these models, we evaluate several word embedding models to determine the most effective combination of features, that is, the one carrying the most spatial evidence. Then, we model words sequentially using recurrent neural networks for a better understanding of contextual information. Without defining restricted sets of local words related to the Coronavirus disease, our framework, called DeepGeoloc, demonstrates its ability to geolocate both tweets and twitterers. It also makes it possible to capture the geosemantics of nonlocal words and to delimit the sparse use of local ones, particularly in retweets and reported events. Compared to some baselines, DeepGeoloc achieved competitive results. It also proves scalable, handling large amounts of data and geolocating new tweets, even those describing new topics related to this disease.
format Online
Article
Text
id pubmed-8315503
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Springer Vienna
record_format MEDLINE/PubMed
spelling pubmed-8315503 2021-07-28 Word embeddings and deep learning for location prediction: tracking Coronavirus from British and American tweets Hasni, Sarra Faiz, Sami Soc Netw Anal Min Original Article
Springer Vienna 2021-07-27 2021 /pmc/articles/PMC8315503/ /pubmed/34335992 http://dx.doi.org/10.1007/s13278-021-00777-5 Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Word embeddings and deep learning for location prediction: tracking Coronavirus from British and American tweets
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8315503/
https://www.ncbi.nlm.nih.gov/pubmed/34335992
http://dx.doi.org/10.1007/s13278-021-00777-5