
Monitoring Physical Activity Levels Using Twitter Data: Infodemiology Study

BACKGROUND: Social media technology such as Twitter allows users to share their thoughts, feelings, and opinions online. The growing body of social media data is becoming a central part of infodemiology research as these data can be combined with other public health datasets (eg, physical activity l...


Bibliographic Details
Main Authors: Liu, Sam, Chen, Brian, Kuo, Alex
Format: Online Article Text
Language: English
Published: JMIR Publications 2019
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6682305/
https://www.ncbi.nlm.nih.gov/pubmed/31162126
http://dx.doi.org/10.2196/12394
_version_ 1783441871895265280
author Liu, Sam
Chen, Brian
Kuo, Alex
author_facet Liu, Sam
Chen, Brian
Kuo, Alex
author_sort Liu, Sam
collection PubMed
description BACKGROUND: Social media technology such as Twitter allows users to share their thoughts, feelings, and opinions online. The growing body of social media data is becoming a central part of infodemiology research as these data can be combined with other public health datasets (eg, physical activity levels) to provide real-time monitoring of psychological and behavioral outcomes that inform health behaviors. Currently, it is unclear whether Twitter data can be used to monitor physical activity levels.
OBJECTIVE: The aim of this study was to establish the feasibility of using Twitter data to monitor physical activity levels by assessing whether the frequency and sentiment of physical activity–related tweets were associated with physical activity levels across the United States.
METHODS: Tweets were collected from Twitter's application programming interface (API) between January 10, 2017, and January 2, 2018. We used Twitter's garden hose method of collecting tweets, which provided a random sample of approximately 1% of all tweets with location metadata falling within the United States. Geotagged tweets were filtered. A list of physical activity–related hashtags was collected and used to further classify these geolocated tweets. Twitter data were merged with physical activity data collected as part of the Behavioral Risk Factor Surveillance System. Multiple linear regression models were fit to assess the relationship between physical activity–related tweets and physical activity levels by county while controlling for population and socioeconomic status measures.
RESULTS: During the study period, 442,959,789 unique tweets were collected, of which 64,005,336 (14.44%) were geotagged with latitude and longitude coordinates. Aggregated data were obtained for a total of 3138 counties in the United States. The mean county-level percentage of physically active individuals was 74.05% (SD 5.2) and 75.30% (SD 4.96) after adjusting for age. The model showed that the percentage of physical activity–related tweets was significantly associated with physical activity levels (beta=.11; SE 0.2; P<.001) and age-adjusted physical activity (beta=.10; SE 0.20; P<.001) on a county level while adjusting for both Gini index and education level. However, the overall explained variance of the model was low (R²=.11). The sentiment of the physical activity–related tweets was not a significant predictor of physical activity level and age-adjusted physical activity on a county level after including the Gini index and education level in the model (P>.05).
CONCLUSIONS: Social media data may be a valuable tool for public health organizations to monitor physical activity levels, as they can overcome the time lag in the reporting of physical activity epidemiology data faced by traditional research methods (eg, surveys and observational studies). Consequently, this tool may have the potential to help public health organizations better mobilize and target physical activity interventions.
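The methods summarized in the abstract (classifying geotagged tweets by hashtag, aggregating by county, then regressing county physical activity levels on the tweet percentage plus socioeconomic covariates) can be illustrated with a minimal Python sketch. This is not the authors' code. The hashtag list, the function names, and the toy inputs are all hypothetical, and the regression is plain ordinary least squares solved through the normal equations, which is one simple way to fit a multiple linear regression and may differ from what the study used.

```python
from collections import defaultdict

# Hypothetical hashtag list for illustration; the study's actual list
# is not given in this record.
PA_HASHTAGS = {"#running", "#gym", "#workout", "#yoga"}


def is_pa_tweet(text):
    """Return True if the tweet text contains a hashtag from PA_HASHTAGS."""
    tokens = {tok.lower().rstrip(".,!?") for tok in text.split()}
    return not PA_HASHTAGS.isdisjoint(tokens)


def pa_tweet_share_by_county(tweets):
    """Map county code -> percentage of its tweets classified as PA-related.

    `tweets` is an iterable of (county_code, text) pairs; assigning a county
    to each geotagged tweet is assumed to have happened upstream.
    """
    totals, hits = defaultdict(int), defaultdict(int)
    for county, text in tweets:
        totals[county] += 1
        hits[county] += is_pa_tweet(text)
    return {c: 100.0 * hits[c] / totals[c] for c in totals}


def ols(X, y):
    """Fit y = b0 + b1*x1 + ... by ordinary least squares.

    Solves the normal equations (X'X) b = X'y with Gauss-Jordan elimination.
    Returns [b0, b1, ...], with the intercept first.
    """
    rows = [[1.0] + list(r) for r in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    for col in range(k):
        # Partial pivoting for numerical stability
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]


if __name__ == "__main__":
    # Synthetic toy tweets, not study data.
    toy_tweets = [
        ("06037", "Morning #running by the beach"),
        ("06037", "Traffic again"),
        ("36061", "Leg day #Gym!"),
    ]
    print(pa_tweet_share_by_county(toy_tweets))
```

In a county-level analysis of the kind described, each row of `X` would hold one county's percentage of physical activity–related tweets together with covariates such as the Gini index and education level, and `y` would hold that county's physical activity level from the survey data.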
format Online
Article
Text
id pubmed-6682305
institution National Center for Biotechnology Information
language English
publishDate 2019
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-6682305 2019-08-19 Monitoring Physical Activity Levels Using Twitter Data: Infodemiology Study Liu, Sam Chen, Brian Kuo, Alex J Med Internet Res Original Paper JMIR Publications 2019-06-03 /pmc/articles/PMC6682305/ /pubmed/31162126 http://dx.doi.org/10.2196/12394 Text en ©Sam Liu, Brian Chen, Alex Kuo. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 03.06.2019. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
spellingShingle Original Paper
Liu, Sam
Chen, Brian
Kuo, Alex
Monitoring Physical Activity Levels Using Twitter Data: Infodemiology Study
title Monitoring Physical Activity Levels Using Twitter Data: Infodemiology Study
title_full Monitoring Physical Activity Levels Using Twitter Data: Infodemiology Study
title_fullStr Monitoring Physical Activity Levels Using Twitter Data: Infodemiology Study
title_full_unstemmed Monitoring Physical Activity Levels Using Twitter Data: Infodemiology Study
title_short Monitoring Physical Activity Levels Using Twitter Data: Infodemiology Study
title_sort monitoring physical activity levels using twitter data: infodemiology study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6682305/
https://www.ncbi.nlm.nih.gov/pubmed/31162126
http://dx.doi.org/10.2196/12394
work_keys_str_mv AT liusam monitoringphysicalactivitylevelsusingtwitterdatainfodemiologystudy
AT chenbrian monitoringphysicalactivitylevelsusingtwitterdatainfodemiologystudy
AT kuoalex monitoringphysicalactivitylevelsusingtwitterdatainfodemiologystudy