Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints
PURPOSE: Syndromic surveillance is aimed at early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which may be recorded in different languages. For automated syndromic surveillance, CCs must be classified into predefined syndro...
Main authors: | Lu, Hsin-Min; Chen, Hsinchun; Zeng, Daniel; King, Chwan-Chuen; Shih, Fuh-Yuan; Wu, Tsung-Shu; Hsiao, Jin-Yi |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Elsevier Ireland Ltd. 2009 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7108263/ https://www.ncbi.nlm.nih.gov/pubmed/18838292 http://dx.doi.org/10.1016/j.ijmedinf.2008.08.004 |
_version_ | 1783512777137061888 |
---|---|
author | Lu, Hsin-Min Chen, Hsinchun Zeng, Daniel King, Chwan-Chuen Shih, Fuh-Yuan Wu, Tsung-Shu Hsiao, Jin-Yi |
author_facet | Lu, Hsin-Min Chen, Hsinchun Zeng, Daniel King, Chwan-Chuen Shih, Fuh-Yuan Wu, Tsung-Shu Hsiao, Jin-Yi |
author_sort | Lu, Hsin-Min |
collection | PubMed |
description | PURPOSE: Syndromic surveillance is aimed at early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which may be recorded in different languages. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories to facilitate subsequent data aggregation and analysis. Despite the fact that syndromic surveillance is largely an international effort, existing CC classification systems do not provide adequate support for processing CCs recorded in non-English languages. This paper reports a multilingual CC classification effort, focusing on CCs recorded in Chinese. METHODS: We propose a novel Chinese CC classification system leveraging a Chinese-English translation module and an existing English CC classification approach. A set of 470 Chinese key phrases was extracted from about one million Chinese CC records using statistical methods. Based on the extracted key phrases, the system translates Chinese text into English and classifies the translated CCs to syndromic categories using an existing English CC classification system. RESULTS: Compared to alternative approaches using a bilingual dictionary and a general-purpose machine translation system, our approach performs significantly better in terms of positive predictive value (PPV or precision), sensitivity (recall), specificity, and F measure (the harmonic mean of PPV and sensitivity), based on a computational experiment using real-world CC records. CONCLUSIONS: Our design provides satisfactory performance in classifying Chinese CCs into syndromic categories for public health surveillance. The overall design of our system also points out a potentially fruitful direction for multilingual CC systems that need to handle languages beyond English and Chinese. |
format | Online Article Text |
id | pubmed-7108263 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | Elsevier Ireland Ltd. |
record_format | MEDLINE/PubMed |
spelling | pubmed-7108263 2020-03-31 Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints Lu, Hsin-Min Chen, Hsinchun Zeng, Daniel King, Chwan-Chuen Shih, Fuh-Yuan Wu, Tsung-Shu Hsiao, Jin-Yi Int J Med Inform Article Elsevier Ireland Ltd. 2009-05 2008-10-05 /pmc/articles/PMC7108263/ /pubmed/18838292 http://dx.doi.org/10.1016/j.ijmedinf.2008.08.004 Text en Copyright © 2008 Elsevier Ireland Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active. |
spellingShingle | Article Lu, Hsin-Min Chen, Hsinchun Zeng, Daniel King, Chwan-Chuen Shih, Fuh-Yuan Wu, Tsung-Shu Hsiao, Jin-Yi Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints |
title | Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints |
title_full | Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints |
title_fullStr | Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints |
title_full_unstemmed | Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints |
title_short | Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints |
title_sort | multilingual chief complaint classification for syndromic surveillance: an experiment with chinese chief complaints |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7108263/ https://www.ncbi.nlm.nih.gov/pubmed/18838292 http://dx.doi.org/10.1016/j.ijmedinf.2008.08.004 |
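
The description above names four evaluation measures: positive predictive value (PPV or precision), sensitivity (recall), specificity, and the F measure, defined there as the harmonic mean of PPV and sensitivity. As a minimal sketch of how these are computed for a single syndromic category, the Python below works from confusion-matrix counts. The function name `cc_metrics` and the counts in the usage lines are hypothetical, chosen only for illustration; they are not taken from the paper and do not reproduce its results.

```python
def cc_metrics(tp, fp, tn, fn):
    """Compute PPV, sensitivity, specificity and F measure for one category.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives. Ratios with an empty
    denominator are reported as 0.0 (an assumption of this sketch).
    """
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # F measure: harmonic mean of PPV and sensitivity
    denom = ppv + sensitivity
    f_measure = 2 * ppv * sensitivity / denom if denom else 0.0
    return {
        "ppv": ppv,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
    }


# Made-up counts for one syndromic category, for illustration only
m = cc_metrics(tp=40, fp=10, tn=930, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.3f}")
```

Because the F measure is a harmonic mean, it stays low whenever either PPV or sensitivity is low, so a classifier cannot score well by trading one entirely for the other.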