Design and analysis of a large-scale COVID-19 tweets dataset

As of July 17, 2020, more than thirteen million people have been diagnosed with the Novel Coronavirus (COVID-19), and half a million people have already lost their lives due to this infectious disease. The World Health Organization declared the COVID-19 outbreak as a pandemic on March 11, 2020. Sinc...

Bibliographic Details
Main Author: Lamsal, Rabindra
Format: Online Article Text
Language: English
Published: Springer US 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7646503/
https://www.ncbi.nlm.nih.gov/pubmed/34764561
http://dx.doi.org/10.1007/s10489-020-02029-z
_version_ 1783606803867631616
author Lamsal, Rabindra
author_facet Lamsal, Rabindra
author_sort Lamsal, Rabindra
collection PubMed
description As of July 17, 2020, more than thirteen million people have been diagnosed with the Novel Coronavirus (COVID-19), and half a million people have already lost their lives due to this infectious disease. The World Health Organization declared the COVID-19 outbreak as a pandemic on March 11, 2020. Since then, social media platforms have experienced an exponential rise in the content related to the pandemic. In the past, Twitter data have been observed to be indispensable in the extraction of situational awareness information relating to any crisis. This paper presents COV19Tweets Dataset (Lamsal 2020a), a large-scale Twitter dataset with more than 310 million COVID-19 specific English language tweets and their sentiment scores. The dataset’s geo version, the GeoCOV19Tweets Dataset (Lamsal 2020b), is also presented. The paper discusses the datasets’ design in detail, and the tweets in both the datasets are analyzed. The datasets are released publicly, anticipating that they would contribute to a better understanding of spatial and temporal dimensions of the public discourse related to the ongoing pandemic. As per the stats, the datasets (Lamsal 2020a, 2020b) have been accessed over 74.5k times, collectively.
format Online
Article
Text
id pubmed-7646503
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-76465032020-11-06 Design and analysis of a large-scale COVID-19 tweets dataset Lamsal, Rabindra Appl Intell (Dordr) Article As of July 17, 2020, more than thirteen million people have been diagnosed with the Novel Coronavirus (COVID-19), and half a million people have already lost their lives due to this infectious disease. The World Health Organization declared the COVID-19 outbreak as a pandemic on March 11, 2020. Since then, social media platforms have experienced an exponential rise in the content related to the pandemic. In the past, Twitter data have been observed to be indispensable in the extraction of situational awareness information relating to any crisis. This paper presents COV19Tweets Dataset (Lamsal 2020a), a large-scale Twitter dataset with more than 310 million COVID-19 specific English language tweets and their sentiment scores. The dataset’s geo version, the GeoCOV19Tweets Dataset (Lamsal 2020b), is also presented. The paper discusses the datasets’ design in detail, and the tweets in both the datasets are analyzed. The datasets are released publicly, anticipating that they would contribute to a better understanding of spatial and temporal dimensions of the public discourse related to the ongoing pandemic. As per the stats, the datasets (Lamsal 2020a, 2020b) have been accessed over 74.5k times, collectively. Springer US 2020-11-06 2021 /pmc/articles/PMC7646503/ /pubmed/34764561 http://dx.doi.org/10.1007/s10489-020-02029-z Text en © Springer Science+Business Media, LLC, part of Springer Nature 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Lamsal, Rabindra
Design and analysis of a large-scale COVID-19 tweets dataset
title Design and analysis of a large-scale COVID-19 tweets dataset
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7646503/
https://www.ncbi.nlm.nih.gov/pubmed/34764561
http://dx.doi.org/10.1007/s10489-020-02029-z