Classification of Twitter Users Who Tweet About E-Cigarettes

BACKGROUND: Despite concerns about their health risks, e‑cigarettes have gained popularity in recent years. Concurrent with the recent increase in e‑cigarette use, social media sites such as Twitter have become a common platform for sharing information about e-cigarettes and to promote marketing of...

Bibliographic Details
Main authors: Kim, Annice, Miano, Thomas, Chew, Robert, Eggers, Matthew, Nonnemaker, James
Format: Online Article Text
Language: English
Published: JMIR Publications 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5635233/
https://www.ncbi.nlm.nih.gov/pubmed/28951381
http://dx.doi.org/10.2196/publichealth.8060
author Kim, Annice
Miano, Thomas
Chew, Robert
Eggers, Matthew
Nonnemaker, James
collection PubMed
description BACKGROUND: Despite concerns about their health risks, e‑cigarettes have gained popularity in recent years. Concurrent with the recent increase in e‑cigarette use, social media sites such as Twitter have become a common platform for sharing information about e‑cigarettes and promoting their marketing. Monitoring the trends in e‑cigarette–related social media activity requires timely assessment of the content of posts and the types of users generating the content. However, little is known about the diversity of the types of users responsible for generating e‑cigarette–related content on Twitter.
OBJECTIVE: The aim of this study was to demonstrate a novel methodology for automatically classifying Twitter users who tweet about e‑cigarette–related topics into distinct categories.
METHODS: We collected approximately 11.5 million e‑cigarette–related tweets posted between November 2014 and October 2016 and obtained a random sample of Twitter users who tweeted about e‑cigarettes. Trained human coders examined the handles’ profiles and manually categorized each as one of the following user types: individual (n=2168), vaper enthusiast (n=334), informed agency (n=622), marketer (n=752), and spammer (n=1021). Next, the Twitter metadata as well as a sample of tweets for each labeled user were gathered, and features that reflect users’ metadata and tweeting behavior were analyzed. Finally, multiple machine learning algorithms were tested to identify a model with the best performance in classifying user types.
RESULTS: Using a classification model that included metadata and features associated with tweeting behavior, we were able to predict with relatively high accuracy five different types of Twitter users who tweet about e‑cigarettes (average F1 score=83.3%). Accuracy varied by user type, with F1 scores of individuals, informed agencies, marketers, spammers, and vaper enthusiasts being 91.1%, 84.4%, 81.2%, 79.5%, and 47.1%, respectively. Vaper enthusiasts were the most challenging user type to predict accurately and were commonly misclassified as marketers. The inclusion of additional tweet-derived features that capture tweeting behavior was found to significantly improve the model performance beyond metadata features alone (an overall F1 score gain of 10.6%).
CONCLUSIONS: This study provides a method for classifying five different types of users who tweet about e‑cigarettes. Our model achieved high levels of classification performance for most groups, and examining the tweeting behavior was critical in improving the model performance. Results can help identify groups engaged in conversations about e‑cigarettes online to help inform public health surveillance, education, and regulatory efforts.
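The abstract reports an average F1 score of 83.3% alongside five per-class scores but does not say how the average was formed. The sketch below compares an unweighted (macro) mean with a mean weighted by class size. It assumes, purely for illustration, that the labeled-sample counts quoted in the abstract can stand in for the class sizes of the evaluation set, which the record does not confirm. Under that assumption the weighted mean comes out near the reported 83.3%, while the macro mean is noticeably lower because the small vaper enthusiast class scores only 47.1%.

```python
# Per-class F1 scores (percent) as quoted in the abstract.
f1_by_class = {
    "individual": 91.1,
    "informed agency": 84.4,
    "marketer": 81.2,
    "spammer": 79.5,
    "vaper enthusiast": 47.1,
}

# Labeled-sample counts quoted in the abstract.
# ASSUMPTION: used here as stand-ins for evaluation-set class sizes.
n_by_class = {
    "individual": 2168,
    "informed agency": 622,
    "marketer": 752,
    "spammer": 1021,
    "vaper enthusiast": 334,
}


def macro_average(scores):
    """Unweighted mean: every class counts equally."""
    return sum(scores.values()) / len(scores)


def weighted_average(scores, weights):
    """Mean weighted by class size: large classes dominate."""
    total = sum(weights[name] for name in scores)
    return sum(scores[name] * weights[name] for name in scores) / total


macro_f1 = macro_average(f1_by_class)
weighted_f1 = weighted_average(f1_by_class, n_by_class)

print(f"macro F1:    {macro_f1:.1f}%")     # 76.7%
print(f"weighted F1: {weighted_f1:.1f}%")  # 83.3%
```

The gap between the two figures shows why a single average can hide weak performance on a minority class.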
format Online
Article
Text
id pubmed-5635233
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-5635233 2017-10-20 Classification of Twitter Users Who Tweet About E-Cigarettes Kim, Annice Miano, Thomas Chew, Robert Eggers, Matthew Nonnemaker, James JMIR Public Health Surveill Original Paper JMIR Publications 2017-09-26 /pmc/articles/PMC5635233/ /pubmed/28951381 http://dx.doi.org/10.2196/publichealth.8060 Text en ©Annice Kim, Thomas Miano, Robert Chew, Matthew Eggers, James Nonnemaker. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 26.09.2017. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited.
The complete bibliographic information, a link to the original publication on http://publichealth.jmir.org, as well as this copyright and license information must be included.
title Classification of Twitter Users Who Tweet About E-Cigarettes
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5635233/
https://www.ncbi.nlm.nih.gov/pubmed/28951381
http://dx.doi.org/10.2196/publichealth.8060