Public Discussion of Anthrax on Twitter: Using Machine Learning to Identify Relevant Topics and Events
BACKGROUND: Social media allows researchers to study opinions and reactions to events in real time. One area needing more study is anthrax-related events. A computational framework that utilizes machine learning techniques was created to collect tweets discussing anthrax, further categorize them as...
Main authors: | Miller, Michele; Romine, William; Oroszi, Terry |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | JMIR Publications 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8277308/ https://www.ncbi.nlm.nih.gov/pubmed/34142975 http://dx.doi.org/10.2196/27976 |
_version_ | 1783722045384687616 |
---|---|
author | Miller, Michele; Romine, William; Oroszi, Terry |
author_facet | Miller, Michele; Romine, William; Oroszi, Terry |
author_sort | Miller, Michele |
collection | PubMed |
description | BACKGROUND: Social media allows researchers to study opinions and reactions to events in real time. One area needing more study is anthrax-related events. A computational framework that utilizes machine learning techniques was created to collect tweets discussing anthrax, further categorize them as relevant by the month of data collection, and detect discussions on anthrax-related events. OBJECTIVE: The objective of this study was to detect discussions on anthrax-related events and to determine the relevance of the tweets and topics of discussion over 12 months of data collection. METHODS: This is an infoveillance study, using tweets in English containing the keywords “Anthrax” and “Bacillus anthracis”, collected from September 25, 2017, through August 15, 2018. Machine learning techniques were used to determine what people were tweeting about anthrax. Data over time were plotted to determine whether an event was detected (a 3-fold spike in tweets). A machine learning classifier was created to categorize tweets by relevance to anthrax. Relevant tweets by month were examined using a topic modeling approach to determine the topics of discussion over time and how these events influence that discussion. RESULTS: Over the 12 months of data collection, a total of 204,008 tweets were collected. Logistic regression analysis revealed the best performance for relevance (precision=0.81; recall=0.81; F(1)-score=0.80). In total, 26 topics were associated with anthrax-related events, tweets that were highly retweeted, natural outbreaks, and news stories. CONCLUSIONS: This study shows that tweets related to anthrax can be collected and analyzed over time to determine what people are discussing and to detect key anthrax-related events. Future studies are required to focus only on opinion tweets, use the methodology to study other terrorism events, or monitor for terrorism threats. |
format | Online Article Text |
id | pubmed-8277308 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | JMIR Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-8277308 2021-07-26 Public Discussion of Anthrax on Twitter: Using Machine Learning to Identify Relevant Topics and Events Miller, Michele; Romine, William; Oroszi, Terry JMIR Public Health Surveill Original Paper JMIR Publications 2021-06-18 /pmc/articles/PMC8277308/ /pubmed/34142975 http://dx.doi.org/10.2196/27976 Text en ©Michele Miller, William Romine, Terry Oroszi. Originally published in JMIR Public Health and Surveillance (https://publichealth.jmir.org), 18.06.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Public Health and Surveillance, is properly cited. The complete bibliographic information, a link to the original publication on https://publichealth.jmir.org, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Miller, Michele Romine, William Oroszi, Terry Public Discussion of Anthrax on Twitter: Using Machine Learning to Identify Relevant Topics and Events |
title | Public Discussion of Anthrax on Twitter: Using Machine Learning to Identify Relevant Topics and Events |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8277308/ https://www.ncbi.nlm.nih.gov/pubmed/34142975 http://dx.doi.org/10.2196/27976 |
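
The abstract defines an event as "a 3-fold spike in tweets" but does not state the baseline against which the spike is measured. The following is a minimal sketch of such a rule, not the authors' implementation. It assumes that a day counts as a spike when its tweet count is at least three times the median of the preceding seven days. The function name `detect_spikes`, the `window` and `fold` parameters, and the sample counts are all hypothetical.

```python
from statistics import median


def detect_spikes(daily_counts, window=7, fold=3.0):
    """Return indices of days whose count is >= `fold` times the median
    of the preceding `window` days.

    Assumption: the trailing-median baseline is an illustrative choice;
    the record does not say how the study defined its baseline.
    """
    spikes = []
    for i in range(window, len(daily_counts)):
        baseline = median(daily_counts[i - window:i])
        # Skip days with an empty baseline to avoid flagging noise.
        if baseline > 0 and daily_counts[i] >= fold * baseline:
            spikes.append(i)
    return spikes


# Made-up daily tweet counts, for illustration only (not study data).
counts = [100, 110, 95, 105, 98, 102, 101, 400, 120, 99]
spike_days = detect_spikes(counts)
print(spike_days)
```

A trailing median is used here instead of a mean so that one earlier spike does not inflate the baseline for the days that follow it; other baselines (for example, a monthly average) would be equally consistent with the abstract's wording.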