Open Source Health Intelligence (OSHINT) for Foodborne Illness Event Characterization
OBJECTIVE: We propose a cloud-based Open Source Health Intelligence (OSHINT) system that uses open source media outlets, such as Twitter and RSS feeds, to automatically characterize foodborne illness events in real time. OSHINT also forecasts response requirements, through predictive models, to all...
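Main Authors: | Ordun, Catherine; Blake, Jane W.; Rosidi, Nathanael; Grigoryan, Vahan; Reffett, Christopher; Aslam, Sadia; Gentilcore, Anastasia; Cyran, Marek; Shelton, Matthew; Klenk, Juergen |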
---|---|
Format: | Online Article Text |
Language: | English |
Published: | University of Illinois at Chicago Library 2013 |
Subjects: | ISDS 2012 Conference Abstracts |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692774/ |
_version_ | 1782274652215705600 |
---|---|
author | Ordun, Catherine; Blake, Jane W.; Rosidi, Nathanael; Grigoryan, Vahan; Reffett, Christopher; Aslam, Sadia; Gentilcore, Anastasia; Cyran, Marek; Shelton, Matthew; Klenk, Juergen |
author_facet | Ordun, Catherine; Blake, Jane W.; Rosidi, Nathanael; Grigoryan, Vahan; Reffett, Christopher; Aslam, Sadia; Gentilcore, Anastasia; Cyran, Marek; Shelton, Matthew; Klenk, Juergen |
author_sort | Ordun, Catherine |
collection | PubMed |
description | OBJECTIVE: We propose a cloud-based Open Source Health Intelligence (OSHINT) system that uses open source media outlets, such as Twitter and RSS feeds, to automatically characterize foodborne illness events in real time. OSHINT also forecasts response requirements, through predictive models, to allow more efficient use of resources, personnel, and countermeasures in biological event response. INTRODUCTION: An increasing amount of global discourse reporting has migrated online, in the form of publicly accessible social media outlets, blogs, wikis, and news feeds. Social media also presents publicly available and highly accessible information about individual, real-time activity that can be leveraged to detect, monitor, and more efficiently respond to biological events. METHODS: Salmonella and Escherichia coli (E. coli) events were selected based on the magnitude and number of outbreaks reported to the Centers for Disease Control and Prevention (CDC) in the last ten years (1). These events affected multiple states and were large enough to ensure appropriate confidence levels when developing response metrics from our prediction models. We collected social media data from 2006 to 2012 because Twitter, Facebook, and other social media came into use during this period. Characterization is defined as the process of identifying specific event features that inform overall situational awareness. The numbers hospitalized, dead, or injured, together with patient demographics and symptoms, were determined to be useful metrics for event characterization and forecasting. Analytical methods, such as term frequency-inverse document frequency (TF-IDF), natural language processing (NLP), and information extraction, were used to characterize events according to these metrics. The NLP lexicon was developed from online news articles describing the events. Lastly, forecasting algorithms were developed to predict the potential response based on similar historical events that had been characterized by our information extraction algorithms. RESULTS: The OSHINT system was developed in Amazon Web Services and includes real-time social media collection for event characterization (see Figure 1). OSHINT currently characterizes the number of victims ill, hospitalized, and dead due to foodborne illness events. OSHINT was used to characterize the recent national 2012 Salmonella event related to cantaloupes, processing news articles and tweets related to the event as they streamed into the system (Figure 2). On August 17, 2012, the OSHINT system identified a large increase in tweets mentioning salmonella. In the social media data, absent (victims missing a work or school day), death, hospital, and sick events accounted for 2, 4, 17, and 283 media mentions, respectively. Our TF-IDF algorithm characterized the salmonella event impact as two dead and 150 sickened by salmonella-tainted cantaloupe. Retrospective analysis of CDC data reported on August 30, 2012 indicated that the salmonella event involved two deaths among 204 cases (2). CONCLUSIONS: The OSHINT team is continually developing and refining the characterization and forecasting algorithms used in the system. Upon completion, OSHINT will characterize symptoms, geography, and demographics for E. coli and Salmonella events. The system will also forecast the number sick, dead, and hospitalized to support an effective and quick response. We will refine our algorithms and evaluate the system against past and future events to provide confidence in our results. |
format | Online Article Text |
id | pubmed-3692774 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | University of Illinois at Chicago Library |
record_format | MEDLINE/PubMed |
spelling | pubmed-3692774 2013-06-26 Open Source Health Intelligence (OSHINT) for Foodborne Illness Event Characterization Ordun, Catherine Blake, Jane W. Rosidi, Nathanael Grigoryan, Vahan Reffett, Christopher Aslam, Sadia Gentilcore, Anastasia Cyran, Marek Shelton, Matthew Klenk, Juergen Online J Public Health Inform ISDS 2012 Conference Abstracts University of Illinois at Chicago Library 2013-04-04 /pmc/articles/PMC3692774/ Text en ©2013 the author(s) http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/ojphi/about/submissions#copyrightNotice This is an Open Access article. Authors own copyright of their articles appearing in the Online Journal of Public Health Informatics. Readers may copy articles without permission of the copyright owner(s), as long as the author and OJPHI are acknowledged in the copy and the copy is used for educational, not-for-profit purposes. |
spellingShingle | ISDS 2012 Conference Abstracts Ordun, Catherine Blake, Jane W. Rosidi, Nathanael Grigoryan, Vahan Reffett, Christopher Aslam, Sadia Gentilcore, Anastasia Cyran, Marek Shelton, Matthew Klenk, Juergen Open Source Health Intelligence (OSHINT) for Foodborne Illness Event Characterization |
title | Open Source Health Intelligence (OSHINT) for Foodborne Illness Event Characterization |
title_full | Open Source Health Intelligence (OSHINT) for Foodborne Illness Event Characterization |
title_fullStr | Open Source Health Intelligence (OSHINT) for Foodborne Illness Event Characterization |
title_full_unstemmed | Open Source Health Intelligence (OSHINT) for Foodborne Illness Event Characterization |
title_short | Open Source Health Intelligence (OSHINT) for Foodborne Illness Event Characterization |
title_sort | open source health intelligence (oshint) for foodborne illness event characterization |
topic | ISDS 2012 Conference Abstracts |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3692774/ |
work_keys_str_mv | AT orduncatherine opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT blakejanew opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT rosidinathanael opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT grigoryanvahan opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT reffettchristopher opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT aslamsadia opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT gentilcoreanastasia opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT cyranmarek opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT sheltonmatthew opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization AT klenkjuergen opensourcehealthintelligenceoshintforfoodborneillnesseventcharacterization |
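
The abstract names TF-IDF and lexicon-based information extraction as the methods used to characterize events, but it gives no implementation detail. The following is a minimal sketch of both ideas in plain Python, not the authors' code. The sample posts, the `LEXICON` terms, and the function names `tokenize`, `tf_idf`, and `count_mentions` are all hypothetical, and the weighting shown (relative term frequency times `log(N / df)`) is only one common TF-IDF variant.

```python
import math
import re
from collections import Counter

# Hypothetical category lexicon; the abstract says the real lexicon was
# built from online news articles, but does not list its terms.
LEXICON = {
    "absent": {"absent", "missed"},
    "death": {"dead", "died", "death", "deaths"},
    "hospital": {"hospital", "hospitalized"},
    "sick": {"sick", "sickened", "ill"},
}

# Made-up example posts, for illustration only.
POSTS = [
    "Two dead after eating salmonella-tainted cantaloupe",
    "Salmonella outbreak: dozens sickened, several hospitalized",
    "Missed work today, feeling sick after eating cantaloupe",
    "New cantaloupe recipe for summer",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    tf is the term count divided by document length; idf is log(N / df),
    so a term that appears in every document gets weight 0.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero on empty docs
        weights.append(
            {t: (c / total) * math.log(n_docs / doc_freq[t])
             for t, c in counts.items()}
        )
    return weights


def count_mentions(docs, lexicon=LEXICON):
    """Count how many documents mention at least one term per category."""
    counts = {category: 0 for category in lexicon}
    for doc in docs:
        tokens = set(tokenize(doc))
        for category, terms in lexicon.items():
            if tokens & terms:
                counts[category] += 1
    return counts


if __name__ == "__main__":
    print(count_mentions(POSTS))
    for post, weights in zip(POSTS, tf_idf(POSTS)):
        top = sorted(weights, key=weights.get, reverse=True)[:3]
        print(post, "->", top)
```

Counting documents rather than raw term occurrences keeps a single post that repeats a word from inflating a category, which matches the abstract's framing of results as "media mentions"; whether OSHINT counted mentions this way is not stated in the source.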