Public Perception Analysis of Tweets During the 2015 Measles Outbreak: Comparative Study Using Convolutional Neural Network Models

BACKGROUND: Timely understanding of public perceptions allows public health agencies to provide up-to-date responses to health crises such as infectious disease outbreaks. Social media such as Twitter provide an unprecedented means of promptly assessing large-scale public response. OBJECTI...

Bibliographic Details
Main authors:	Du, Jingcheng; Tang, Lu; Xiang, Yang; Zhi, Degui; Xu, Jun; Song, Hsing-Yi; Tao, Cui
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2018
Subjects:	Original Paper
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6056740/ https://www.ncbi.nlm.nih.gov/pubmed/29986843 http://dx.doi.org/10.2196/jmir.9413

_version_	1783341396234600448
author	Du, Jingcheng Tang, Lu Xiang, Yang Zhi, Degui Xu, Jun Song, Hsing-Yi Tao, Cui
author_sort	Du, Jingcheng
collection	PubMed
description	BACKGROUND: Timely understanding of public perceptions allows public health agencies to provide up-to-date responses to health crises such as infectious disease outbreaks. Social media such as Twitter provide an unprecedented means of promptly assessing large-scale public response. OBJECTIVE: The aims of this study were to develop a scheme for a comprehensive public perception analysis of a measles outbreak based on Twitter data and to demonstrate the superiority of convolutional neural network (CNN) models over conventional machine learning methods on measles outbreak-related tweet classification tasks with a relatively small and highly unbalanced gold standard training set. METHODS: We first designed a comprehensive scheme for the analysis of public perception of measles based on tweets, including 3 dimensions: discussion themes, emotions expressed, and attitude toward vaccination. All 1,154,156 tweets containing the word “measles” posted between December 1, 2014, and April 30, 2015, were purchased and downloaded from DiscoverText.com. Two expert annotators curated a gold standard of 1151 tweets (approximately 0.1% of all tweets) based on the 3-dimensional scheme. Next, a tweet classification system based on the CNN framework was developed. We compared the performance of the CNN models with that of 4 conventional machine learning models and another neural network model. We also compared the impact of different word embedding configurations for the CNN models: (1) Stanford GloVe embedding trained on billions of tweets in the general domain, (2) measles-specific embedding trained on our 1 million measles-related tweets, and (3) a combination of the 2 embeddings. RESULTS: Cohen kappa intercoder reliability values for the annotation were 0.78, 0.72, and 0.80 on the 3 dimensions, respectively. Class distributions within the gold standard were highly unbalanced for all dimensions. The CNN models performed better on all classification tasks than k-nearest neighbors, naïve Bayes, support vector machines, or random forest. Detailed comparison between support vector machines and the CNN models showed that the major contributor to the overall superiority of the CNN models is the improvement in recall, especially for classes with low occurrence. The CNN model with the combination of the 2 embeddings led to better performance on discussion themes and emotions expressed (microaveraging F1 scores of 0.7811 and 0.8592, respectively), while the CNN model with Stanford embedding achieved the best performance on attitude toward vaccination (microaveraging F1 score of 0.8642). CONCLUSIONS: The proposed scheme can successfully classify the public’s opinions and emotions in multiple dimensions, which would facilitate the timely understanding of public perceptions during the outbreak of an infectious disease. Compared with conventional machine learning methods, our CNN models showed superiority on measles-related tweet classification tasks with a relatively small and highly unbalanced gold standard. With the success of these tasks, our proposed scheme and CNN-based tweet classification system are expected to be useful for the analysis of tweets about other infectious diseases such as influenza and Ebola.
format	Online Article Text
id	pubmed-6056740
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-6056740 2018-07-27 J Med Internet Res Original Paper JMIR Publications 2018-07-09 /pmc/articles/PMC6056740/ /pubmed/29986843 http://dx.doi.org/10.2196/jmir.9413 Text en ©Jingcheng Du, Lu Tang, Yang Xiang, Degui Zhi, Jun Xu, Hsing-Yi Song, Cui Tao. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 09.07.2018. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title	Public Perception Analysis of Tweets During the 2015 Measles Outbreak: Comparative Study Using Convolutional Neural Network Models
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6056740/ https://www.ncbi.nlm.nih.gov/pubmed/29986843 http://dx.doi.org/10.2196/jmir.9413
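
The abstract reports Cohen kappa for annotator agreement, and micro-averaged F1 and recall on low-occurrence classes for the classifiers. The following is a minimal Python sketch of how these three quantities are commonly computed; it is not the authors' code. The labels `common` and `rare` and the ten toy annotations are hypothetical, chosen only to mimic an unbalanced class distribution, and do not come from the study.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen kappa: agreement between two label sequences, corrected for chance."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement: product of the two marginal label frequencies, summed over labels.
    expected = sum(count_a[l] * count_b[l] for l in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)


def per_class_recall(gold, pred):
    """Recall per class: correctly predicted items divided by gold items of that class."""
    gold_counts = Counter(gold)
    hits = Counter(g for g, p in zip(gold, pred) if g == p)
    return {label: hits[label] / total for label, total in gold_counts.items()}


def micro_f1(gold, pred):
    """Micro-averaged F1: pool TP, FP, and FN over all classes, then compute F1 once."""
    tp = fp = fn = 0
    for label in set(gold) | set(pred):
        for g, p in zip(gold, pred):
            if p == label and g == label:
                tp += 1
            elif p == label:
                fp += 1
            elif g == label:
                fn += 1
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 8 "common" items and 2 "rare" items (not from the study).
gold = ["common"] * 8 + ["rare"] * 2
# A classifier that misses one of the two rare items.
pred = ["common"] * 8 + ["common", "rare"]

print("kappa:", cohen_kappa(gold, pred))
print("recall per class:", per_class_recall(gold, pred))
print("micro F1:", micro_f1(gold, pred))
```

For single-label classification, micro-averaged F1 equals overall accuracy, so a high score can hide poor recall on a rare class; inspecting per-class recall alongside it makes that visible.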
