
How the world’s collective attention is being paid to a pandemic: COVID-19 related n-gram time series for 24 languages on Twitter

In confronting the global spread of the coronavirus disease COVID-19 pandemic we must have coordinated medical, operational, and political responses. In all efforts, data is crucial. Fundamentally, and in the possible absence of a vaccine for 12 to 18 months, we need universal, well-documented testi...


Bibliographic Details
Main authors: Alshaabi, Thayer, Arnold, Michael V., Minot, Joshua R., Adams, Jane Lydia, Dewhurst, David Rushing, Reagan, Andrew J., Muhamad, Roby, Danforth, Christopher M., Dodds, Peter Sheridan
Format: Online Article Text
Language: English
Published: Public Library of Science 2021
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787459/
https://www.ncbi.nlm.nih.gov/pubmed/33406101
http://dx.doi.org/10.1371/journal.pone.0244476
_version_ 1783632828962963456
author Alshaabi, Thayer
Arnold, Michael V.
Minot, Joshua R.
Adams, Jane Lydia
Dewhurst, David Rushing
Reagan, Andrew J.
Muhamad, Roby
Danforth, Christopher M.
Dodds, Peter Sheridan
author_sort Alshaabi, Thayer
collection PubMed
description In confronting the global spread of the coronavirus disease COVID-19 pandemic we must have coordinated medical, operational, and political responses. In all efforts, data is crucial. Fundamentally, and in the possible absence of a vaccine for 12 to 18 months, we need universal, well-documented testing for both the presence of the disease as well as confirmed recovery through serological tests for antibodies, and we need to track major socioeconomic indices. But we also need auxiliary data of all kinds, including data related to how populations are talking about the unfolding pandemic through news and stories. To in part help on the social media side, we curate a set of 2000 day-scale time series of 1- and 2-grams across 24 languages on Twitter that are most ‘important’ for April 2020 with respect to April 2019. We determine importance through our allotaxonometric instrument, rank-turbulence divergence. We make some basic observations about some of the time series, including a comparison to numbers of confirmed deaths due to COVID-19 over time. We broadly observe across all languages a peak for the language-specific word for ‘virus’ in January 2020 followed by a decline through February and then a surge through March and April. The world’s collective attention dropped away while the virus spread out from China. We host the time series on Gitlab, updating them on a daily basis while relevant. Our main intent is for other researchers to use these time series to enhance whatever analyses that may be of use during the pandemic as well as for retrospective investigations.
format Online
Article
Text
id pubmed-7787459
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-7787459 2021-01-14 How the world’s collective attention is being paid to a pandemic: COVID-19 related n-gram time series for 24 languages on Twitter Alshaabi, Thayer Arnold, Michael V. Minot, Joshua R. Adams, Jane Lydia Dewhurst, David Rushing Reagan, Andrew J. Muhamad, Roby Danforth, Christopher M. Dodds, Peter Sheridan PLoS One Research Article In confronting the global spread of the coronavirus disease COVID-19 pandemic we must have coordinated medical, operational, and political responses. In all efforts, data is crucial. Fundamentally, and in the possible absence of a vaccine for 12 to 18 months, we need universal, well-documented testing for both the presence of the disease as well as confirmed recovery through serological tests for antibodies, and we need to track major socioeconomic indices. But we also need auxiliary data of all kinds, including data related to how populations are talking about the unfolding pandemic through news and stories. To in part help on the social media side, we curate a set of 2000 day-scale time series of 1- and 2-grams across 24 languages on Twitter that are most ‘important’ for April 2020 with respect to April 2019. We determine importance through our allotaxonometric instrument, rank-turbulence divergence. We make some basic observations about some of the time series, including a comparison to numbers of confirmed deaths due to COVID-19 over time. We broadly observe across all languages a peak for the language-specific word for ‘virus’ in January 2020 followed by a decline through February and then a surge through March and April. The world’s collective attention dropped away while the virus spread out from China. We host the time series on Gitlab, updating them on a daily basis while relevant. Our main intent is for other researchers to use these time series to enhance whatever analyses that may be of use during the pandemic as well as for retrospective investigations.
Public Library of Science 2021-01-06 /pmc/articles/PMC7787459/ /pubmed/33406101 http://dx.doi.org/10.1371/journal.pone.0244476 Text en © 2021 Alshaabi et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Alshaabi, Thayer
Arnold, Michael V.
Minot, Joshua R.
Adams, Jane Lydia
Dewhurst, David Rushing
Reagan, Andrew J.
Muhamad, Roby
Danforth, Christopher M.
Dodds, Peter Sheridan
How the world’s collective attention is being paid to a pandemic: COVID-19 related n-gram time series for 24 languages on Twitter
title How the world’s collective attention is being paid to a pandemic: COVID-19 related n-gram time series for 24 languages on Twitter
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7787459/
https://www.ncbi.nlm.nih.gov/pubmed/33406101
http://dx.doi.org/10.1371/journal.pone.0244476