Spatial Opinion Mining from COVID-19 Twitter Data

PURPOSE: In the first quarter of 2020, the World Health Organization (WHO) declared COVID-19 a global public health emergency. Users all over the world then shared their thoughts about COVID-19 on social media platforms such as Twitter and Facebook. So, it is important t...

Bibliographic Details
Main Authors:	Syed, M.A., Decoupes, R., Arsevska, E., Roche, M., Teisseire, M.
Format:	Online Article Text
Language:	English
Published:	Published by Elsevier Ltd. 2022
Subjects:	Ps04.09 (549)
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8884810/ http://dx.doi.org/10.1016/j.ijid.2021.12.065

_version_	1784660248445845504
author	Syed, M.A. Decoupes, R. Arsevska, E. Roche, M. Teisseire, M.
author_facet	Syed, M.A. Decoupes, R. Arsevska, E. Roche, M. Teisseire, M.
author_sort	Syed, M.A.
collection	PubMed
description	PURPOSE: In the first quarter of 2020, the World Health Organization (WHO) declared COVID-19 a global public health emergency. Users all over the world then shared their thoughts about COVID-19 on social media platforms such as Twitter and Facebook, so it is important to analyze public opinion about COVID-19 across regions and over time. To address the spatial side of this analysis, earlier work proposed H-TF-IDF (Hierarchy-based measure for tweet analysis), a measure for extracting terms from tweet data. In this work, we focus on sentiment analysis of the terms selected by H-TF-IDF for spatial groups of tweets, in order to understand local situations during the ongoing COVID-19 epidemic over different time frames. METHODS & MATERIALS: The first step is to extract terms from tweets using the H-TF-IDF approach. These terms are used in two ways: 1) to select the tweets that contain them, and 2) as features for sentiment analysis. The text is then cleaned in a preprocessing step. Next, vectorization models, namely bag-of-words (BOW) and term frequency-inverse document frequency (TF-IDF), are combined with n-gram techniques to extract the features used to train the sentiment prediction models. Lastly, statistical and machine learning models, including logistic regression and support vector machine (SVM), are applied to classify the spatial tweet groups. For preliminary results, experiments are conducted on the H-TF-IDF tweet corpus, which carries geocoded spatial information, for January 2020. These tweets are extracted from the dataset collected by E. Chen (https://github.com/echen102/COVID-19-TweetIDs), which focuses on the early stage of the outbreak. Every prediction model uses the same experimental setup, an 80%/20% train-test split. 
RESULTS: The results illustrate that the specific terms highlighted by H-TF-IDF provide useful information that would not have been identified without this spatial analysis. The classification assigns the spatial tweet groups to positive, negative, and neutral classes using subjectivity and polarity measures. CONCLUSION: The current work is applied to English-language Twitter data. Future work will incorporate other languages into the sentiment analysis. Furthermore, BERT will be used to extend these features.
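The vectorization and split steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: it uses plain TF-IDF with one common smoothed IDF variant rather than H-TF-IDF, the function names (`ngrams`, `fit_idf`, `tfidf_vector`, `train_test_split`) are hypothetical, and the tweets and labels are invented for illustration.

```python
import math
import random
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return all 1..n_max-grams."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def fit_idf(docs, n_max=2):
    """Smoothed inverse document frequency for every n-gram seen in docs."""
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(ngrams(doc, n_max)))  # count each n-gram once per doc
    n_docs = len(docs)
    return {g: math.log((1 + n_docs) / (1 + c)) + 1.0 for g, c in doc_freq.items()}


def tfidf_vector(doc, idf, n_max=2):
    """Sparse TF-IDF vector (a dict) for one document; unseen n-grams are dropped."""
    counts = Counter(g for g in ngrams(doc, n_max) if g in idf)
    return {g: c * idf[g] for g, c in counts.items()}


def train_test_split(items, test_ratio=0.2, seed=0):
    """Shuffle a copy of items and split it into (train, test)."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


# Invented tweets and labels, for illustration only (not from the paper's corpus).
tweets = [
    ("masks sold out everywhere", "negative"),
    ("stay safe and wash your hands", "positive"),
    ("new cases reported today", "neutral"),
    ("so worried about the outbreak", "negative"),
    ("grateful for our health workers", "positive"),
    ("airport screening starts monday", "neutral"),
    ("this lockdown is terrible", "negative"),
    ("great news about recoveries", "positive"),
    ("officials hold press briefing", "neutral"),
    ("scared to travel right now", "negative"),
]

train, test = train_test_split(tweets)                       # 80% / 20%
idf = fit_idf([text for text, _ in train])                   # fit on training text only
train_features = [tfidf_vector(text, idf) for text, _ in train]
test_features = [tfidf_vector(text, idf) for text, _ in test]
```

The IDF weights are fitted on the training portion only, so the held-out tweets do not influence the features. The resulting sparse vectors could then be passed to a classifier such as logistic regression or an SVM.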
format	Online Article Text
id	pubmed-8884810
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Published by Elsevier Ltd.
record_format	MEDLINE/PubMed
spelling	pubmed-8884810 2022-03-01 Spatial Opinion Mining from COVID-19 Twitter Data Syed, M.A. Decoupes, R. Arsevska, E. Roche, M. Teisseire, M. Int J Infect Dis Ps04.09 (549) Published by Elsevier Ltd. 2022-03 2022-02-28 /pmc/articles/PMC8884810/ http://dx.doi.org/10.1016/j.ijid.2021.12.065 Text en Copyright © 2021 Published by Elsevier Ltd. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
spellingShingle	Ps04.09 (549) Syed, M.A. Decoupes, R. Arsevska, E. Roche, M. Teisseire, M. Spatial Opinion Mining from COVID-19 Twitter Data
title	Spatial Opinion Mining from COVID-19 Twitter Data
title_full	Spatial Opinion Mining from COVID-19 Twitter Data
title_fullStr	Spatial Opinion Mining from COVID-19 Twitter Data
title_full_unstemmed	Spatial Opinion Mining from COVID-19 Twitter Data
title_short	Spatial Opinion Mining from COVID-19 Twitter Data
title_sort	spatial opinion mining from covid-19 twitter data
topic	Ps04.09 (549)
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8884810/ http://dx.doi.org/10.1016/j.ijid.2021.12.065
work_keys_str_mv	AT syedma spatialopinionminingfromcovid19twitterdata AT decoupesr spatialopinionminingfromcovid19twitterdata AT arsevskae spatialopinionminingfromcovid19twitterdata AT rochem spatialopinionminingfromcovid19twitterdata AT teisseirem spatialopinionminingfromcovid19twitterdata

Similar Items