Analysis of False Positive Errors of an Acute Respiratory Infection Text Classifier due to Contextual Features

Text classifiers have been used for biosurveillance tasks to identify patients with diseases or conditions of interest. When compared to a clinical reference standard of 280 cases of Acute Respiratory Infection (ARI), a text classifier consisting of simple rules and NegEx plus string matching for sp...

Bibliographic Details
Main authors: South, Brett R., Shen, Shuying, Chapman, Wendy W., Delisle, Sylvain, Samore, Matthew H., Gundlapalli, Adi V.
Format: Text
Language: English
Published: American Medical Informatics Association 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041533/
https://www.ncbi.nlm.nih.gov/pubmed/21347150
_version_ 1782198440866873344
author South, Brett R.
Shen, Shuying
Chapman, Wendy W.
Delisle, Sylvain
Samore, Matthew H.
Gundlapalli, Adi V.
collection PubMed
description Text classifiers have been used for biosurveillance tasks to identify patients with diseases or conditions of interest. When compared to a clinical reference standard of 280 cases of Acute Respiratory Infection (ARI), a text classifier consisting of simple rules and NegEx plus string matching for specific concepts of interest produced 569 (4%) false positive (FP) cases. Using instance-level manual annotation, we estimate the prevalence of contextual attributes and error types leading to FP cases. Errors were due to (1) deletion errors from abbreviations, spelling mistakes and missing synonyms (57%); (2) insertion errors from templated document structures such as check boxes and lists of signs and symptoms (36%); and (3) substitution errors from irrelevant concepts and alternate meanings for the same word (6%). We demonstrate that specific concept attributes contribute to false positive cases. These results will inform modifications and adaptations to improve text classifier performance.
format Text
id pubmed-3041533
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher American Medical Informatics Association
record_format MEDLINE/PubMed
spelling pubmed-3041533 2011-02-23 Summit on Translat Bioinforma Articles American Medical Informatics Association 2010-03-01 /pmc/articles/PMC3041533/ /pubmed/21347150 Text en ©2010 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose
title Analysis of False Positive Errors of an Acute Respiratory Infection Text Classifier due to Contextual Features
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041533/
https://www.ncbi.nlm.nih.gov/pubmed/21347150