
Crisis social media data labeled for storm-related information and toponym usage

Social media provides citizens and officials with important sources of information during times of crisis. This data article makes available labeled, storm-related social media data collected over a six-hour period during a severe storm and F1 tornado that struck Central Pennsylvania on May 1st, 2...


Bibliographic Details
Main Author: Grace, Rob
Format: Online Article Text
Language: English
Published: Elsevier 2020
Subjects: Computer Science
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7200777/
https://www.ncbi.nlm.nih.gov/pubmed/32382607
http://dx.doi.org/10.1016/j.dib.2020.105595
author Grace, Rob
collection PubMed
description Social media provides citizens and officials with important sources of information during times of crisis. This data article makes available labeled, storm-related social media data collected over a six-hour period during a severe storm and F1 tornado that struck Central Pennsylvania on May 1st, 2017. Three datasets were collected from Twitter using location, keyword, and network filtering techniques, respectively. Only 2% of the 22,706 total tweets overlap among the datasets, providing researchers with a broader scope of information than is normally available when collecting tweets using location (i.e., geotag-based) and keyword filtering, alone or in combination, during a crisis. Each data collection technique is described in detail, including network filtering, which collects data from networks of social media users associated with a geographic area. The datasets are manually labeled for information content and toponym usage. The 22,706 tweet IDs, dehydrated for privacy, are labeled for relevance (storm-related or off-topic) and for 19 types of storm-related information organized into six categories: infrastructure damage, service disruption, personal experience, weather updates, weather forecasts, and weather warnings. Data are also labeled for toponym usage (with or without toponyms), location (local, remote, and generic toponyms), and granularity (hyperlocal, municipal, and regional toponyms). The comprehensively labeled datasets provide researchers with opportunities to analyze crisis-related information behaviors and volunteered location information behaviors during a hyperlocal crisis event, as well as to develop and evaluate automated filtering, geolocation, and event detection techniques that can aid citizens and crisis responders.
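The overlap figure in the description can be illustrated with a small sketch. It assumes each of the three collections is available as a list of tweet ID strings and measures overlap as the share of unique IDs that occur in more than one collection; the record does not state how the article itself computes the 2% figure, so this definition, the function name `overlap_share`, and the toy IDs are all assumptions made for illustration.

```python
from collections import Counter
from typing import Iterable


def overlap_share(*collections: Iterable[str]) -> float:
    """Return the fraction of unique IDs that occur in more than one collection."""
    counts = Counter()
    for ids in collections:
        # De-duplicate within a collection so repeats there do not count as overlap.
        counts.update(set(ids))
    if not counts:
        return 0.0
    shared = sum(1 for n in counts.values() if n > 1)
    return shared / len(counts)


# Toy IDs, invented for illustration only (not taken from the dataset).
location_ids = ["101", "102", "103"]
keyword_ids = ["103", "104", "105"]
network_ids = ["106", "107", "108", "109", "110"]

share = overlap_share(location_ids, keyword_ids, network_ids)
print(f"{share:.0%} of unique IDs appear in more than one collection")
```

A low share under this measure would mean that the filtering techniques retrieve largely disjoint sets of tweets, which is the point the description makes about the broader scope of the combined datasets.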
format Online
Article
Text
id pubmed-7200777
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-7200777 2020-05-07 Crisis social media data labeled for storm-related information and toponym usage Grace, Rob Data Brief Computer Science
Elsevier 2020-04-21 /pmc/articles/PMC7200777/ /pubmed/32382607 http://dx.doi.org/10.1016/j.dib.2020.105595 Text en © 2020 The Author(s) http://creativecommons.org/licenses/by/4.0/ This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
title Crisis social media data labeled for storm-related information and toponym usage
topic Computer Science
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7200777/
https://www.ncbi.nlm.nih.gov/pubmed/32382607
http://dx.doi.org/10.1016/j.dib.2020.105595