Word embeddings and deep learning for location prediction: tracking Coronavirus from British and American tweets
With the spread of the Coronavirus pandemic, determining its individual and societal impacts has become increasingly important. Recent research pays special attention to the Coronavirus infodemic on social networks to study such impacts. To this end, we think that applying a geol...
Main authors: | Hasni, Sarra; Faiz, Sami |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Vienna 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8315503/ https://www.ncbi.nlm.nih.gov/pubmed/34335992 http://dx.doi.org/10.1007/s13278-021-00777-5 |
_version_ | 1783729731433136128 |
---|---|
author | Hasni, Sarra Faiz, Sami |
author_sort | Hasni, Sarra |
collection | PubMed |
description | With the spread of the Coronavirus pandemic, determining its individual and societal impacts has become increasingly important. Recent research pays special attention to the Coronavirus infodemic on social networks to study such impacts. To this end, we think that applying a geolocation process is crucial before proceeding to infodemic management. Indeed, the spread of reported events and news on social networks makes it more challenging to identify infected areas or the locations of information owners, especially at the state level. In this paper, we focus on linguistic features to encode regional variations from short and noisy texts such as tweets in order to track this disease. We pay particular attention to contextual information for a better encoding of these features. We rely on neural network-based models to capture relationships between words according to their contexts. As examples of such models, we evaluate several word embedding models to determine the most effective combination of features, namely the one carrying the most spatial evidence. Then, we model words sequentially using recurrent neural networks for a better understanding of contextual information. Without defining restricted sets of local words related to the Coronavirus disease, our framework, called DeepGeoloc, demonstrates its ability to geolocate both tweets and Twitter users. It also makes it possible to capture the geosemantics of nonlocal words and to delimit the sparse use of local ones, particularly in retweets and reported events. Compared to some baselines, DeepGeoloc achieved competitive results. It also proves scalable enough to handle large amounts of data and to geolocate new tweets, even those describing new topics related to this disease. |
format | Online Article Text |
id | pubmed-8315503 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Springer Vienna |
record_format | MEDLINE/PubMed |
spelling | pubmed-8315503 2021-07-28 Word embeddings and deep learning for location prediction: tracking Coronavirus from British and American tweets Hasni, Sarra Faiz, Sami Soc Netw Anal Min Original Article Springer Vienna 2021-07-27 2021 /pmc/articles/PMC8315503/ /pubmed/34335992 http://dx.doi.org/10.1007/s13278-021-00777-5 Text en © The Author(s), under exclusive licence to Springer-Verlag GmbH Austria, part of Springer Nature 2021 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
title | Word embeddings and deep learning for location prediction: tracking Coronavirus from British and American tweets |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8315503/ https://www.ncbi.nlm.nih.gov/pubmed/34335992 http://dx.doi.org/10.1007/s13278-021-00777-5 |