OMG U got flu? Analysis of shared health messages for bio-surveillance

BACKGROUND: Micro-blogging services such as Twitter offer the potential to crowdsource epidemics in real-time. However, Twitter posts (‘tweets’) are often ambiguous and reactive to media trends. In order to ground user messages in epidemic response we focused on tracking reports of self-protective b...

Bibliographic Details
Main authors: Collier, Nigel; Son, Nguyen Truong; Nguyen, Ngoc Mai
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3239309/
https://www.ncbi.nlm.nih.gov/pubmed/22166368
http://dx.doi.org/10.1186/2041-1480-2-S5-S9
author Collier, Nigel
Son, Nguyen Truong
Nguyen, Ngoc Mai
collection PubMed
description BACKGROUND: Micro-blogging services such as Twitter offer the potential to crowdsource epidemics in real-time. However, Twitter posts (‘tweets’) are often ambiguous and reactive to media trends. In order to ground user messages in epidemic response we focused on tracking reports of self-protective behaviour such as avoiding public gatherings or increased sanitation as the basis for further risk analysis. RESULTS: We created guidelines for tagging self-protective behaviour based on the behaviour response survey of Jones and Salathé (2009). Applying the guidelines to a corpus of 5283 Twitter messages related to influenza-like illness showed a high level of inter-annotator agreement (kappa 0.86). We employed supervised learning using unigrams, bigrams and regular expressions as features with two supervised classifiers (SVM and Naive Bayes) to classify tweets into 4 self-reported protective behaviour categories plus a self-reported diagnosis. In addition to classification performance we report moderately strong Spearman’s Rho correlation by comparing classifier output against WHO/NREVSS laboratory data for A(H1N1) in the USA during the 2009-2010 influenza season. CONCLUSIONS: The study adds to evidence supporting a high degree of correlation between pre-diagnostic social media signals and diagnostic influenza case data, pointing the way towards low-cost sensor networks. We believe that the signals we have modelled may be applicable to a wide range of diseases.
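The abstract reports two statistics: inter-annotator agreement (kappa) and Spearman's rho between classifier output and laboratory data. The following is a minimal sketch of how such figures are conventionally computed, not the authors' code. It assumes two annotators (Cohen's kappa; the record does not state which kappa variant was used) and average ranks for tied values. All function names are illustrative and do not come from the article.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's label proportions.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series of at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when a series is constant")
    return cov / (sx * sy)
```

In a setting like the one described, `spearman_rho` would be applied to two weekly series of equal length, for example counts of tweets assigned to a category and counts of laboratory-confirmed cases. The exact alignment used in the study is not given in this record.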
format Online
Article
Text
id pubmed-3239309
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3239309 2011-12-16 OMG U got flu? Analysis of shared health messages for bio-surveillance Collier, Nigel; Son, Nguyen Truong; Nguyen, Ngoc Mai J Biomed Semantics Research BioMed Central 2011-10-06 /pmc/articles/PMC3239309/ /pubmed/22166368 http://dx.doi.org/10.1186/2041-1480-2-S5-S9 Text en Copyright ©2011 Collier et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title OMG U got flu? Analysis of shared health messages for bio-surveillance
topic Research