Cargando…
Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set
BACKGROUND: At the time of this writing, the coronavirus disease (COVID-19) pandemic outbreak has already put tremendous strain on many countries' citizens, resources, and economies around the world. Social distancing measures, travel bans, self-quarantines, and business closures are changing t...
Autores principales: | , , |
---|---|
Formato: | Online Artículo Texto |
Lenguaje: | English |
Publicado: |
JMIR Publications
2020
|
Materias: | |
Acceso en línea: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7265654/ https://www.ncbi.nlm.nih.gov/pubmed/32427106 http://dx.doi.org/10.2196/19273 |
_version_ | 1783541169262690304 |
---|---|
author | Chen, Emily Lerman, Kristina Ferrara, Emilio |
author_facet | Chen, Emily Lerman, Kristina Ferrara, Emilio |
author_sort | Chen, Emily |
collection | PubMed |
description | BACKGROUND: At the time of this writing, the coronavirus disease (COVID-19) pandemic outbreak has already put tremendous strain on many countries' citizens, resources, and economies around the world. Social distancing measures, travel bans, self-quarantines, and business closures are changing the very fabric of societies worldwide. With people forced out of public spaces, much of the conversation about these phenomena now occurs online on social media platforms like Twitter. OBJECTIVE: In this paper, we describe a multilingual COVID-19 Twitter data set that we are making available to the research community via our COVID-19-TweetIDs GitHub repository. METHODS: We started this ongoing data collection on January 28, 2020, leveraging Twitter’s streaming application programming interface (API) and Tweepy to follow certain keywords and accounts that were trending at the time data collection began. We used Twitter’s search API to query for past tweets, resulting in the earliest tweets in our collection dating back to January 21, 2020. RESULTS: Since the inception of our collection, we have actively maintained and updated our GitHub repository on a weekly basis. We have published over 123 million tweets, with over 60% of the tweets in English. This paper also presents basic statistics that show that Twitter activity responds and reacts to COVID-19-related events. CONCLUSIONS: It is our hope that our contribution will enable the study of online conversation dynamics in the context of a planetary-scale epidemic outbreak of unprecedented proportions and implications. This data set could also help track COVID-19-related misinformation and unverified rumors or enable the understanding of fear and panic—and undoubtedly more. |
format | Online Article Text |
id | pubmed-7265654 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-72656542020-06-05 Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set Chen, Emily Lerman, Kristina Ferrara, Emilio JMIR Public Health Surveill Original Paper BACKGROUND: At the time of this writing, the coronavirus disease (COVID-19) pandemic outbreak has already put tremendous strain on many countries' citizens, resources, and economies around the world. Social distancing measures, travel bans, self-quarantines, and business closures are changing the very fabric of societies worldwide. With people forced out of public spaces, much of the conversation about these phenomena now occurs online on social media platforms like Twitter. OBJECTIVE: In this paper, we describe a multilingual COVID-19 Twitter data set that we are making available to the research community via our COVID-19-TweetIDs GitHub repository. METHODS: We started this ongoing data collection on January 28, 2020, leveraging Twitter’s streaming application programming interface (API) and Tweepy to follow certain keywords and accounts that were trending at the time data collection began. We used Twitter’s search API to query for past tweets, resulting in the earliest tweets in our collection dating back to January 21, 2020. RESULTS: Since the inception of our collection, we have actively maintained and updated our GitHub repository on a weekly basis. We have published over 123 million tweets, with over 60% of the tweets in English. This paper also presents basic statistics that show that Twitter activity responds and reacts to COVID-19-related events. CONCLUSIONS: It is our hope that our contribution will enable the study of online conversation dynamics in the context of a planetary-scale epidemic outbreak of unprecedented proportions and implications. This data set could also help track COVID-19-related misinformation and unverified rumors or enable the understanding of fear and panic—and undoubtedly more. JMIR Publications 2020-05-29 /pmc/articles/PMC7265654/ /pubmed/32427106 http://dx.doi.org/10.2196/19273 Text en ©Emily Chen, Kristina Lerman, Emilio Ferrara. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 29.05.2020. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Chen, Emily Lerman, Kristina Ferrara, Emilio Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set |
title | Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set |
title_full | Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set |
title_fullStr | Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set |
title_full_unstemmed | Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set |
title_short | Tracking Social Media Discourse About the COVID-19 Pandemic: Development of a Public Coronavirus Twitter Data Set |
title_sort | tracking social media discourse about the covid-19 pandemic: development of a public coronavirus twitter data set |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7265654/ https://www.ncbi.nlm.nih.gov/pubmed/32427106 http://dx.doi.org/10.2196/19273 |
work_keys_str_mv | AT chenemily trackingsocialmediadiscourseaboutthecovid19pandemicdevelopmentofapubliccoronavirustwitterdataset AT lermankristina trackingsocialmediadiscourseaboutthecovid19pandemicdevelopmentofapubliccoronavirustwitterdataset AT ferraraemilio trackingsocialmediadiscourseaboutthecovid19pandemicdevelopmentofapubliccoronavirustwitterdataset |