
Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study

INTRODUCTION: Syndromic surveillance is designed for early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which are generally recorded in the local language. For automated syndromic surveillance, CCs must be classified into pr...


Bibliographic Details
Main authors: Halász, Sylvia, Brown, Philip, Oktay, Cem, Çevik, Arif Alper, Kılıçaslan, Isa, Goodall, Colin, Cochrane, Dennis G, Fowler, Thomas R, Jacobson, Guy, Tse, Simon, Allegra, John R
Format: Online Article Text
Language: English
Published: Libertas Academica 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3653813/
https://www.ncbi.nlm.nih.gov/pubmed/23700370
http://dx.doi.org/10.4137/BII.S11334
_version_ 1782269454419230720
author Halász, Sylvia
Brown, Philip
Oktay, Cem
Çevik, Arif Alper
Kılıçaslan, Isa
Goodall, Colin
Cochrane, Dennis G
Fowler, Thomas R
Jacobson, Guy
Tse, Simon
Allegra, John R
author_sort Halász, Sylvia
collection PubMed
description INTRODUCTION: Syndromic surveillance is designed for early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which are generally recorded in the local language. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories. The n-gram classifier is created by using text fragments to measure associations between CCs and a syndromic grouping of ICD codes. OBJECTIVES: The objective was to create a Turkish n-gram CC classifier for the respiratory syndrome and then compare daily volumes between the n-gram CC classifier and a respiratory ICD-10 code grouping on a test set of data. METHODS: The design was a feasibility study based on retrospective cohort data. The setting was a university hospital emergency department (ED) in Turkey. Included were all ED visits in the 2002 database of this hospital. Two of the authors created a respiratory grouping of International Classification of Diseases, 10th Revision ICD-10-CM codes by consensus, chosen to be similar to a standard respiratory (RESP) grouping of ICD codes created by the Electronic Surveillance System for Early Notification of Community-based Epidemics (ESSENCE), a project of the Centers for Disease Control and Prevention. An n-gram method adapted from AT&T Labs’ technologies was applied to the first 10 months of data as a training set to create a Turkish CC RESP classifier. The classifier was then tested on the subsequent 2 months of visits to generate a time series graph and determine the correlation between daily volumes measured by the CC classifier and by the RESP ICD-10 grouping. RESULTS: The Turkish ED database contained 30,157 visits. The correlation (R²) of n-gram versus ICD-10 for the test set was 0.78. CONCLUSION: The n-gram method automatically created a CC RESP classifier of the Turkish CCs that performed similarly to the ICD-10 RESP grouping. The n-gram technique has the advantage of systematic, consistent, and rapid deployment as well as language independence.
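The abstract does not spell out the AT&T Labs method, so the following is only a minimal sketch of the general idea it describes: character n-grams from chief complaints are weighted by how often they co-occur with a respiratory ICD grouping, and daily volumes from the two sources are compared by R². All function names, the 4-character window, the 0.5 threshold, and the toy Turkish complaints are assumptions made for illustration, not details from the study.

```python
from collections import Counter


def char_ngrams(text, n=4):
    """Return the overlapping character n-grams of a chief complaint.

    Note: str.lower() is not Turkish-locale-aware (it maps "I" to "i"),
    which is acceptable for a sketch but not for production use.
    """
    s = " ".join(text.lower().split())  # lowercase, collapse whitespace
    if len(s) < n:
        return [s] if s else []
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def train(records, n=4, min_count=1):
    """records: iterable of (chief_complaint, is_resp) pairs, where is_resp
    says whether the visit's ICD code fell in the respiratory grouping.

    Returns {ngram: share of training visits containing it that were RESP}.
    """
    total, resp = Counter(), Counter()
    for cc, is_resp in records:
        for g in set(char_ngrams(cc, n)):  # count each n-gram once per visit
            total[g] += 1
            if is_resp:
                resp[g] += 1
    return {g: resp[g] / c for g, c in total.items() if c >= min_count}


def score(cc, weights, n=4):
    """Mean RESP association over the n-grams seen in training (0.0 if none)."""
    vals = [weights[g] for g in char_ngrams(cc, n) if g in weights]
    return sum(vals) / len(vals) if vals else 0.0


def classify(cc, weights, threshold=0.5, n=4):
    """Label a complaint RESP when its score reaches the assumed threshold."""
    return score(cc, weights, n) >= threshold


def r_squared(xs, ys):
    """Square of the Pearson correlation between two equal-length series,
    e.g. daily RESP counts from the classifier and from the ICD grouping."""
    m = len(xs)
    mx, my = sum(xs) / m, sum(ys) / m
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant series
    return sxy * sxy / (sxx * syy)


# Invented toy training data (not from the study's database).
toy_records = [
    ("öksürük", True),         # cough
    ("öksürük ateş", True),    # cough, fever
    ("nefes darlığı", True),   # shortness of breath
    ("karın ağrısı", False),   # abdominal pain
    ("baş ağrısı", False),     # headache
]
toy_weights = train(toy_records)
```

Because the weights are learned directly from character fragments, nothing in this sketch depends on translating the complaints or on a Turkish word list, which is the sense in which the approach is language independent. In a setup like the one described, `train` would be run on the first 10 months of visits and `r_squared` on the daily counts for the following 2 months.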
format Online
Article
Text
id pubmed-3653813
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-3653813 2013-05-22 Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study Halász, Sylvia Brown, Philip Oktay, Cem Çevik, Arif Alper Kılıçaslan, Isa Goodall, Colin Cochrane, Dennis G Fowler, Thomas R Jacobson, Guy Tse, Simon Allegra, John R Biomed Inform Insights Original Research INTRODUCTION: Syndromic surveillance is designed for early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which are generally recorded in the local language. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories. The n-gram classifier is created by using text fragments to measure associations between CCs and a syndromic grouping of ICD codes. OBJECTIVES: The objective was to create a Turkish n-gram CC classifier for the respiratory syndrome and then compare daily volumes between the n-gram CC classifier and a respiratory ICD-10 code grouping on a test set of data. METHODS: The design was a feasibility study based on retrospective cohort data. The setting was a university hospital emergency department (ED) in Turkey. Included were all ED visits in the 2002 database of this hospital. Two of the authors created a respiratory grouping of International Classification of Diseases, 10th Revision ICD-10-CM codes by consensus, chosen to be similar to a standard respiratory (RESP) grouping of ICD codes created by the Electronic Surveillance System for Early Notification of Community-based Epidemics (ESSENCE), a project of the Centers for Disease Control and Prevention. An n-gram method adapted from AT&T Labs’ technologies was applied to the first 10 months of data as a training set to create a Turkish CC RESP classifier. The classifier was then tested on the subsequent 2 months of visits to generate a time series graph and determine the correlation between daily volumes measured by the CC classifier and by the RESP ICD-10 grouping. RESULTS: The Turkish ED database contained 30,157 visits. The correlation (R²) of n-gram versus ICD-10 for the test set was 0.78. CONCLUSION: The n-gram method automatically created a CC RESP classifier of the Turkish CCs that performed similarly to the ICD-10 RESP grouping. The n-gram technique has the advantage of systematic, consistent, and rapid deployment as well as language independence. Libertas Academica 2013-04-25 /pmc/articles/PMC3653813/ /pubmed/23700370 http://dx.doi.org/10.4137/BII.S11334 Text en © 2013 the author(s), publisher and licensee Libertas Academica Ltd. This is an open access article published under the Creative Commons CC-BY-NC 3.0 license.
spellingShingle Original Research
Halász, Sylvia
Brown, Philip
Oktay, Cem
Çevik, Arif Alper
Kılıçaslan, Isa
Goodall, Colin
Cochrane, Dennis G
Fowler, Thomas R
Jacobson, Guy
Tse, Simon
Allegra, John R
Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study
title Using n-Grams for Syndromic Surveillance in a Turkish Emergency Department Without English Translation: A Feasibility Study
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3653813/
https://www.ncbi.nlm.nih.gov/pubmed/23700370
http://dx.doi.org/10.4137/BII.S11334