
Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints

PURPOSE: Syndromic surveillance is aimed at early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which may be recorded in different languages. For automated syndromic surveillance, CCs must be classified into predefined syndro...


Bibliographic Details
Main Authors: Lu, Hsin-Min, Chen, Hsinchun, Zeng, Daniel, King, Chwan-Chuen, Shih, Fuh-Yuan, Wu, Tsung-Shu, Hsiao, Jin-Yi
Format: Online Article Text
Language: English
Published: Elsevier Ireland Ltd. 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7108263/
https://www.ncbi.nlm.nih.gov/pubmed/18838292
http://dx.doi.org/10.1016/j.ijmedinf.2008.08.004
author Lu, Hsin-Min
Chen, Hsinchun
Zeng, Daniel
King, Chwan-Chuen
Shih, Fuh-Yuan
Wu, Tsung-Shu
Hsiao, Jin-Yi
collection PubMed
description PURPOSE: Syndromic surveillance is aimed at early detection of disease outbreaks. An important data source for syndromic surveillance is free-text chief complaints (CCs), which may be recorded in different languages. For automated syndromic surveillance, CCs must be classified into predefined syndromic categories to facilitate subsequent data aggregation and analysis. Despite the fact that syndromic surveillance is largely an international effort, existing CC classification systems do not provide adequate support for processing CCs recorded in non-English languages. This paper reports a multilingual CC classification effort, focusing on CCs recorded in Chinese. METHODS: We propose a novel Chinese CC classification system leveraging a Chinese-English translation module and an existing English CC classification approach. A set of 470 Chinese key phrases was extracted from about one million Chinese CC records using statistical methods. Based on the extracted key phrases, the system translates Chinese text into English and classifies the translated CCs to syndromic categories using an existing English CC classification system. RESULTS: Compared to alternative approaches using a bilingual dictionary and a general-purpose machine translation system, our approach performs significantly better in terms of positive predictive value (PPV or precision), sensitivity (recall), specificity, and F measure (the harmonic mean of PPV and sensitivity), based on a computational experiment using real-world CC records. CONCLUSIONS: Our design provides satisfactory performance in classifying Chinese CCs into syndromic categories for public health surveillance. The overall design of our system also points out a potentially fruitful direction for multilingual CC systems that need to handle languages beyond English and Chinese.
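The translate-then-classify design and the evaluation measures named in the abstract can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the five-entry phrase table, the syndrome keyword rules, the greedy longest-match scan, and the function names `translate`, `classify`, and `evaluate` are hypothetical and are not the 470 key phrases, the English CC classifier, or the matching method used by the authors. Only the metric definitions (PPV, sensitivity, specificity, and F measure as the harmonic mean of PPV and sensitivity) follow the abstract directly.

```python
# Illustrative sketch only. The phrase table and keyword rules are hypothetical
# stand-ins, not the paper's 470 key phrases or its English CC classifier.

# Hypothetical Chinese key phrase -> English term table.
KEY_PHRASES = {
    "發燒": "fever",
    "咳嗽": "cough",
    "腹痛": "abdominal pain",
    "腹瀉": "diarrhea",
    "頭痛": "headache",
}

# Hypothetical keyword rules standing in for an existing English CC classifier.
SYNDROME_KEYWORDS = {
    "respiratory": {"cough"},
    "gastrointestinal": {"abdominal pain", "diarrhea"},
    "constitutional": {"fever"},
}


def translate(cc: str) -> list[str]:
    """Scan a Chinese CC left to right, emitting the English term for the
    longest key phrase found at each position (assumed matching strategy)."""
    max_len = max(len(phrase) for phrase in KEY_PHRASES)
    terms, i = [], 0
    while i < len(cc):
        for size in range(min(max_len, len(cc) - i), 0, -1):
            phrase = cc[i:i + size]
            if phrase in KEY_PHRASES:
                terms.append(KEY_PHRASES[phrase])
                i += size
                break
        else:
            i += 1  # no key phrase starts here; skip one character
    return terms


def classify(cc: str) -> set[str]:
    """Translate the CC, then assign every syndrome whose keywords appear."""
    terms = set(translate(cc))
    return {name for name, kws in SYNDROME_KEYWORDS.items() if terms & kws}


def evaluate(predicted: list[bool], actual: list[bool]) -> dict[str, float]:
    """PPV, sensitivity, specificity and F measure for one syndrome."""
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)
    fp = sum(p and not a for p, a in pairs)
    fn = sum(a and not p for p, a in pairs)
    tn = sum(not p and not a for p, a in pairs)
    ppv = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    # F measure: harmonic mean of PPV and sensitivity.
    f_measure = (2 * ppv * sensitivity / (ppv + sensitivity)
                 if ppv + sensitivity else 0.0)
    return {"ppv": ppv, "sensitivity": sensitivity,
            "specificity": specificity, "f_measure": f_measure}
```

With this toy table, `classify("發燒咳嗽三天")` yields the constitutional and respiratory categories, because the unmatched characters are skipped rather than translated. A real system would need a far larger phrase table and a validated English classifier.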
format Online
Article
Text
id pubmed-7108263
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Elsevier Ireland Ltd.
record_format MEDLINE/PubMed
spelling pubmed-7108263 2020-03-31 Int J Med Inform Article Elsevier Ireland Ltd. 2009-05 2008-10-05 /pmc/articles/PMC7108263/ /pubmed/18838292 http://dx.doi.org/10.1016/j.ijmedinf.2008.08.004 Text en Copyright © 2008 Elsevier Ireland Ltd. All rights reserved. Since January 2020 Elsevier has created a COVID-19 resource centre with free information in English and Mandarin on the novel coronavirus COVID-19. The COVID-19 resource centre is hosted on Elsevier Connect, the company's public news and information website. Elsevier hereby grants permission to make all its COVID-19-related research that is available on the COVID-19 resource centre - including this research content - immediately available in PubMed Central and other publicly funded repositories, such as the WHO COVID database with rights for unrestricted research re-use and analyses in any form or by any means with acknowledgement of the original source. These permissions are granted for free by Elsevier for as long as the COVID-19 resource centre remains active.
title Multilingual chief complaint classification for syndromic surveillance: An experiment with Chinese chief complaints
topic Article