Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study
INTRODUCTION: Syndromic surveillance is designed for early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which are generally recorded in the local language. For automated syndromic surveillance, CCs must be classified into pr...
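Main authors: | Halász, Sylvia; Brown, Philip; Oktay, Cem; Çevik, Arif Alper; Kılıçaslan, Isa; Goodall, Colin; Cochrane, Dennis G; Fowler, Thomas R; Jacobson, Guy; Tse, Simon; Allegra, John R |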
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Libertas Academica 2013 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3653813/ https://www.ncbi.nlm.nih.gov/pubmed/23700370 http://dx.doi.org/10.4137/BII.S11334 |
_version_ | 1782269454419230720 |
---|---|
author | Halász, Sylvia Brown, Philip Oktay, Cem Çevik, Arif Alper Kılıçaslan, Isa Goodall, Colin Cochrane, Dennis G Fowler, Thomas R Jacobson, Guy Tse, Simon Allegra, John R |
author_facet | Halász, Sylvia Brown, Philip Oktay, Cem Çevik, Arif Alper Kılıçaslan, Isa Goodall, Colin Cochrane, Dennis G Fowler, Thomas R Jacobson, Guy Tse, Simon Allegra, John R |
author_sort | Halász, Sylvia |
collection | PubMed |
description | INTRODUCTION: Syndromic surveillance is designed for early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which are generally recorded in the local language. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories. The n-gram classifier is created by using text fragments to measure associations between CCs and a syndromic grouping of ICD codes. OBJECTIVES: The objective was to create a Turkish n-gram CC classifier for the respiratory syndrome and then compare daily volumes between the n-gram CC classifier and a respiratory ICD-10 code grouping on a test set of data. METHODS: The design was a feasibility study based on retrospective cohort data. The setting was a university hospital emergency department (ED) in Turkey. Included were all ED visits in the 2002 database of this hospital. Two of the authors created a respiratory grouping of International Classification of Diseases, 10th Revision (ICD-10-CM) codes by consensus, chosen to be similar to a standard respiratory (RESP) grouping of ICD codes created by the Electronic Surveillance System for Early Notification of Community-based Epidemics (ESSENCE), a project of the Centers for Disease Control and Prevention. An n-gram method adapted from AT&T Labs’ technologies was applied to the first 10 months of data as a training set to create a Turkish CC RESP classifier. The classifier was then tested on the subsequent 2 months of visits to generate a time series graph and determine the correlation between the daily volumes measured by the CC classifier and those measured by the RESP ICD-10 grouping. RESULTS: The Turkish ED database contained 30,157 visits. The correlation (R²) of n-gram versus ICD-10 for the test set was 0.78. CONCLUSION: The n-gram method automatically created a CC RESP classifier of the Turkish CCs that performed similarly to the ICD-10 RESP grouping. The n-gram technique has the advantage of systematic, consistent, and rapid deployment as well as language independence. |
format | Online Article Text |
id | pubmed-3653813 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Libertas Academica |
record_format | MEDLINE/PubMed |
spelling | pubmed-3653813 2013-05-22 Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study Halász, Sylvia Brown, Philip Oktay, Cem Çevik, Arif Alper Kılıçaslan, Isa Goodall, Colin Cochrane, Dennis G Fowler, Thomas R Jacobson, Guy Tse, Simon Allegra, John R Biomed Inform Insights Original Research INTRODUCTION: Syndromic surveillance is designed for early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which are generally recorded in the local language. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories. The n-gram classifier is created by using text fragments to measure associations between CCs and a syndromic grouping of ICD codes. OBJECTIVES: The objective was to create a Turkish n-gram CC classifier for the respiratory syndrome and then compare daily volumes between the n-gram CC classifier and a respiratory ICD-10 code grouping on a test set of data. METHODS: The design was a feasibility study based on retrospective cohort data. The setting was a university hospital emergency department (ED) in Turkey. Included were all ED visits in the 2002 database of this hospital. Two of the authors created a respiratory grouping of International Classification of Diseases, 10th Revision (ICD-10-CM) codes by consensus, chosen to be similar to a standard respiratory (RESP) grouping of ICD codes created by the Electronic Surveillance System for Early Notification of Community-based Epidemics (ESSENCE), a project of the Centers for Disease Control and Prevention. An n-gram method adapted from AT&T Labs’ technologies was applied to the first 10 months of data as a training set to create a Turkish CC RESP classifier. The classifier was then tested on the subsequent 2 months of visits to generate a time series graph and determine the correlation between the daily volumes measured by the CC classifier and those measured by the RESP ICD-10 grouping. RESULTS: The Turkish ED database contained 30,157 visits. The correlation (R²) of n-gram versus ICD-10 for the test set was 0.78. CONCLUSION: The n-gram method automatically created a CC RESP classifier of the Turkish CCs that performed similarly to the ICD-10 RESP grouping. The n-gram technique has the advantage of systematic, consistent, and rapid deployment as well as language independence. Libertas Academica 2013-04-25 /pmc/articles/PMC3653813/ /pubmed/23700370 http://dx.doi.org/10.4137/BII.S11334 Text en © 2013 the author(s), publisher and licensee Libertas Academica Ltd. This is an open access article published under the Creative Commons CC-BY-NC 3.0 license. |
spellingShingle | Original Research Halász, Sylvia Brown, Philip Oktay, Cem Çevik, Arif Alper Kılıçaslan, Isa Goodall, Colin Cochrane, Dennis G Fowler, Thomas R Jacobson, Guy Tse, Simon Allegra, John R Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study |
title | Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study |
title_full | Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study |
title_fullStr | Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study |
title_full_unstemmed | Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study |
title_short | Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study |
title_sort | using n-grams for syndromic surveillance in a turkish emergency department without english translation: a feasibility study |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3653813/ https://www.ncbi.nlm.nih.gov/pubmed/23700370 http://dx.doi.org/10.4137/BII.S11334 |
work_keys_str_mv | AT halaszsylvia usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT brownphilip usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT oktaycem usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT cevikarifalper usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT kılıcaslanisa usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT goodallcolin usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT cochranedennisg usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT fowlerthomasr usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT jacobsonguy usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT tsesimon usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy AT allegrajohnr usingngramsforsyndromicsurveillanceinaturkishemergencydepartmentwithoutenglishtranslationafeasibilitystudy |
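
Illustrative note: the description field says the classifier uses text fragments to measure associations between chief complaints and an ICD code grouping, then compares daily volumes by correlation. The record does not give the algorithm itself, so the Python sketch below is only a simplified stand-in and not the AT&T Labs method. The function names, the 4-character fragment length, the smoothing constant, and the toy Turkish complaints are all assumptions made for illustration.

```python
from collections import Counter


def char_ngrams(text, n=4):
    """Overlapping character n-grams of a lower-cased, space-padded string.

    Caveat: str.lower() is not Turkish-aware (dotted/dotless I), which a
    real system would need to handle.
    """
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(max(len(padded) - n + 1, 1))]


def train(complaints, labels, n=4, alpha=1.0):
    """Estimate, per n-gram, a smoothed share of visits carrying a RESP code.

    labels[i] is True when visit i has an ICD code in the respiratory
    grouping. Smoothing pulls rare n-grams toward the overall RESP rate.
    """
    in_resp, total = Counter(), Counter()
    for cc, is_resp in zip(complaints, labels):
        for gram in set(char_ngrams(cc, n)):
            total[gram] += 1
            if is_resp:
                in_resp[gram] += 1
    prior = sum(labels) / len(labels)
    weights = {g: (in_resp[g] + alpha * prior) / (total[g] + alpha) for g in total}
    return weights, prior


def score(cc, weights, prior, n=4):
    """Average n-gram weight of a complaint; unseen n-grams fall back to the prior."""
    grams = char_ngrams(cc, n)
    return sum(weights.get(g, prior) for g in grams) / len(grams)


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length series of daily counts."""
    count = len(xs)
    mean_x, mean_y = sum(xs) / count, sum(ys) / count
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


# Toy training data (made up for illustration, not from the study).
train_ccs = ["öksürük", "nefes darlığı", "öksürük ve ateş",
             "karın ağrısı", "baş ağrısı", "ayak burkulması"]
train_labels = [True, True, True, False, False, False]
weights, prior = train(train_ccs, train_labels)

# A complaint is counted as respiratory when its score exceeds a threshold;
# using the prior as the threshold is an arbitrary choice for this sketch.
for cc in ["öksürük balgam", "bel ağrısı"]:
    print(cc, round(score(cc, weights, prior), 3), score(cc, weights, prior) > prior)
```

In a setup like the one the description outlines, the weights would be fitted on the first 10 months of visits, the daily count of complaints classified as respiratory would be computed for the following 2 months, and `r_squared` would compare that series with the daily count of visits in the ICD-10 RESP grouping.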