Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier
BACKGROUND: The amount of incoming data into physicians’ offices is increasing, thereby making it difficult to process information efficiently and accurately to maximize positive patient outcomes. Current manual processes of screening for individual terms within long free-text documents are tedious...
Main authors: | Singh, Mark; Murthy, Akansh; Singh, Shridhar |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Gunther Eysenbach, 2015 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4409648/ https://www.ncbi.nlm.nih.gov/pubmed/25863643 http://dx.doi.org/10.2196/medinform.3793 |
_version_ | 1782368217038061568 |
---|---|
author | Singh, Mark Murthy, Akansh Singh, Shridhar |
author_facet | Singh, Mark Murthy, Akansh Singh, Shridhar |
author_sort | Singh, Mark |
collection | PubMed |
description | BACKGROUND: The amount of incoming data into physicians’ offices is increasing, thereby making it difficult to process information efficiently and accurately to maximize positive patient outcomes. Current manual processes of screening for individual terms within long free-text documents are tedious and error-prone. This paper explores the use of statistical methods and computer systems to assist clinical data management. OBJECTIVE: The objective of this study was to verify and validate the use of a naive Bayesian classifier as a means of properly prioritizing important clinical data, specifically that of free-text radiology reports. METHODS: There were one hundred reports that were first used to train the algorithm based on physicians’ categorization of clinical reports as high-priority or low-priority. Then, the algorithm was used to evaluate 354 reports. Additional beautification procedures such as section extraction, text preprocessing, and negation detection were performed. RESULTS: The algorithm evaluated the 354 reports with discrimination between high-priority and low-priority reports, resulting in a bimodal probability distribution. In all scenarios tested, the false negative rates were below 1.1% and the recall rates ranged from 95.65% to 98.91%. In the case of 50% prior probability and 80% threshold probability, the accuracy of this Bayesian classifier was 93.50%, with a positive predictive value (precision) of 80.54%. It also showed a sensitivity (recall) of 98.91% and an F-measure of 88.78%. CONCLUSIONS: The results showed that the algorithm could be trained to detect abnormal radiology results by accurately screening clinical reports. Such a technique can potentially be used to enable automatic flagging of critical results. In addition to accuracy, the algorithm was able to minimize false negatives, which is important for clinical applications. We conclude that a Bayesian statistical classifier, by flagging reports with abnormal findings, can assist a physician in reviewing radiology reports more efficiently. This higher level of prioritization allows physicians to address important radiologic findings in a timelier manner and may also aid in minimizing errors of omission. |
format | Online Article Text |
id | pubmed-4409648 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2015 |
publisher | Gunther Eysenbach |
record_format | MEDLINE/PubMed |
spelling | pubmed-44096482015-05-08 Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier Singh, Mark Murthy, Akansh Singh, Shridhar JMIR Med Inform Original Paper BACKGROUND: The amount of incoming data into physicians’ offices is increasing, thereby making it difficult to process information efficiently and accurately to maximize positive patient outcomes. Current manual processes of screening for individual terms within long free-text documents are tedious and error-prone. This paper explores the use of statistical methods and computer systems to assist clinical data management. OBJECTIVE: The objective of this study was to verify and validate the use of a naive Bayesian classifier as a means of properly prioritizing important clinical data, specifically that of free-text radiology reports. METHODS: There were one hundred reports that were first used to train the algorithm based on physicians’ categorization of clinical reports as high-priority or low-priority. Then, the algorithm was used to evaluate 354 reports. Additional beautification procedures such as section extraction, text preprocessing, and negation detection were performed. RESULTS: The algorithm evaluated the 354 reports with discrimination between high-priority and low-priority reports, resulting in a bimodal probability distribution. In all scenarios tested, the false negative rates were below 1.1% and the recall rates ranged from 95.65% to 98.91%. In the case of 50% prior probability and 80% threshold probability, the accuracy of this Bayesian classifier was 93.50%, with a positive predictive value (precision) of 80.54%. It also showed a sensitivity (recall) of 98.91% and an F-measure of 88.78%. CONCLUSIONS: The results showed that the algorithm could be trained to detect abnormal radiology results by accurately screening clinical reports. Such a technique can potentially be used to enable automatic flagging of critical results. In addition to accuracy, the algorithm was able to minimize false negatives, which is important for clinical applications. We conclude that a Bayesian statistical classifier, by flagging reports with abnormal findings, can assist a physician in reviewing radiology reports more efficiently. This higher level of prioritization allows physicians to address important radiologic findings in a timelier manner and may also aid in minimizing errors of omission. Gunther Eysenbach 2015-04-10 /pmc/articles/PMC4409648/ /pubmed/25863643 http://dx.doi.org/10.2196/medinform.3793 Text en ©Mark Singh, Akansh Murthy, Shridhar Singh. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 10.04.2015. http://creativecommons.org/licenses/by/2.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included. |
spellingShingle | Original Paper Singh, Mark Murthy, Akansh Singh, Shridhar Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier |
title | Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier |
title_full | Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier |
title_fullStr | Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier |
title_full_unstemmed | Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier |
title_short | Prioritization of Free-Text Clinical Documents: A Novel Use of a Bayesian Classifier |
title_sort | prioritization of free-text clinical documents: a novel use of a bayesian classifier |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4409648/ https://www.ncbi.nlm.nih.gov/pubmed/25863643 http://dx.doi.org/10.2196/medinform.3793 |
work_keys_str_mv | AT singhmark prioritizationoffreetextclinicaldocumentsanoveluseofabayesianclassifier AT murthyakansh prioritizationoffreetextclinicaldocumentsanoveluseofabayesianclassifier AT singhshridhar prioritizationoffreetextclinicaldocumentsanoveluseofabayesianclassifier |
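
The abstract describes a naive Bayesian classifier that flags a report as high-priority when its posterior probability reaches a threshold, given a prior probability. The following is a minimal sketch of that general technique, not the authors' implementation: the toy reports, the function names (`train`, `prob_high`, `is_high_priority`, `f_measure`), whitespace tokenization, and add-one smoothing are all assumptions made for illustration, and the section extraction and negation detection mentioned in the abstract are omitted. The last function checks that the reported F-measure follows from the reported precision and recall.

```python
import math
from collections import Counter

# Hypothetical toy training data; the study used 100 physician-labeled reports.
TOY_REPORTS = [
    ("large mass in right lung", "high"),
    ("acute fracture of left femur", "high"),
    ("no acute abnormality", "low"),
    ("lungs are clear", "low"),
]


def train(docs):
    """Count word occurrences per class from (text, label) pairs."""
    counts = {"high": Counter(), "low": Counter()}
    for text, label in docs:
        counts[label].update(text.lower().split())
    vocab = set(counts["high"]) | set(counts["low"])
    return counts, vocab


def prob_high(text, counts, vocab, prior_high=0.5):
    """Posterior probability of the 'high' class under naive Bayes.

    Uses add-one (Laplace) smoothing and skips words unseen in training.
    Both choices are assumptions, not details taken from the paper.
    """
    log_odds = math.log(prior_high) - math.log(1.0 - prior_high)
    totals = {label: sum(c.values()) for label, c in counts.items()}
    for word in text.lower().split():
        if word not in vocab:
            continue
        p_high = (counts["high"][word] + 1) / (totals["high"] + len(vocab))
        p_low = (counts["low"][word] + 1) / (totals["low"] + len(vocab))
        log_odds += math.log(p_high) - math.log(p_low)
    return 1.0 / (1.0 + math.exp(-log_odds))


def is_high_priority(text, counts, vocab, prior_high=0.5, threshold=0.8):
    """Flag a report when its posterior reaches the threshold probability."""
    return prob_high(text, counts, vocab, prior_high) >= threshold


def f_measure(precision, recall):
    """Harmonic mean of precision and recall (the F1 score)."""
    return 2 * precision * recall / (precision + recall)


counts, vocab = train(TOY_REPORTS)
print(is_high_priority("mass in lung", counts, vocab))
print(is_high_priority("lungs are clear", counts, vocab))
# Reported precision 80.54% and recall 98.91% give an F-measure near 88.78%.
print(round(100 * f_measure(0.8054, 0.9891), 2))
```

Setting the threshold above 50% trades recall for precision on the flagged set; the abstract reports results for a 50% prior and an 80% threshold, which are used as the defaults here.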