Identifying relevant medical reports from an assorted report collection using the multinomial naïve Bayes classifier and the UMLS

A patient’s electronic medical record contains a large number of medical reports and imaging studies. Identifying the relevant information in order to make a diagnosis can be a time-consuming process that can easily overwhelm the physician. Summarizing key clinical information for physicians evaluat...

Bibliographic Details
Main authors: Bashyam, Vijayaraghavan; Morioka, Craig; El-Saden, Suzie; Bui, Alex AT; Taira, Ricky K
Format: Online Article Text
Language: English
Published: 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9592058/
https://www.ncbi.nlm.nih.gov/pubmed/36284749
_version_ 1784814838951706624
author Bashyam, Vijayaraghavan
Morioka, Craig
El-Saden, Suzie
Bui, Alex AT
Taira, Ricky K
author_facet Bashyam, Vijayaraghavan
Morioka, Craig
El-Saden, Suzie
Bui, Alex AT
Taira, Ricky K
author_sort Bashyam, Vijayaraghavan
collection PubMed
description A patient’s electronic medical record contains a large number of medical reports and imaging studies. Identifying the relevant information in order to make a diagnosis can be a time-consuming process that can easily overwhelm the physician. Summarizing key clinical information for physicians evaluating brain tumor patients is an ongoing research project at our institution. Notably, identifying documents associated with brain tumor is an important step in collecting the data relevant for summarization. Current electronic medical record systems lack meta-information which is useful in structuring heterogeneous medical information. Thus, reports relevant to a particular task cannot be easily retrieved from a structured database. This necessitates content analysis methods for identifying relevant reports. This paper reports a system designed to identify brain-tumor related reports from an assorted collection of clinical reports. A large collection of clinical reports was obtained from our university hospital database. A domain expert manually annotated the documents, classifying them into ‘related’ and ‘unrelated’ categories. A multinomial naïve Bayes classifier was trained to use word level and UMLS concept level features from the reports to identify brain tumor related reports from the assorted collection. The system was trained on 90% and tested on 10% of the manually annotated corpus. A ten-fold cross validation is reported. Performance of the system was best (f-score 94.7) when the system was trained using both word level and UMLS concept level features. Using UMLS concepts improved classifier accuracy.
format Online
Article
Text
id pubmed-9592058
institution National Center for Biotechnology Information
language English
publishDate 2007
record_format MEDLINE/PubMed
spelling pubmed-9592058 2022-10-24 Identifying relevant medical reports from an assorted report collection using the multinomial naïve Bayes classifier and the UMLS Bashyam, Vijayaraghavan Morioka, Craig El-Saden, Suzie Bui, Alex AT Taira, Ricky K Indian J Med Inform Article A patient’s electronic medical record contains a large number of medical reports and imaging studies. Identifying the relevant information in order to make a diagnosis can be a time-consuming process that can easily overwhelm the physician. Summarizing key clinical information for physicians evaluating brain tumor patients is an ongoing research project at our institution. Notably, identifying documents associated with brain tumor is an important step in collecting the data relevant for summarization. Current electronic medical record systems lack meta-information which is useful in structuring heterogeneous medical information. Thus, reports relevant to a particular task cannot be easily retrieved from a structured database. This necessitates content analysis methods for identifying relevant reports. This paper reports a system designed to identify brain-tumor related reports from an assorted collection of clinical reports. A large collection of clinical reports was obtained from our university hospital database. A domain expert manually annotated the documents, classifying them into ‘related’ and ‘unrelated’ categories. A multinomial naïve Bayes classifier was trained to use word level and UMLS concept level features from the reports to identify brain tumor related reports from the assorted collection. The system was trained on 90% and tested on 10% of the manually annotated corpus. A ten-fold cross validation is reported. Performance of the system was best (f-score 94.7) when the system was trained using both word level and UMLS concept level features. Using UMLS concepts improved classifier accuracy.
2007 /pmc/articles/PMC9592058/ /pubmed/36284749 Text en https://creativecommons.org/licenses/by/3.0/licensee Indian Journal of Medical Informatics under Creative Commons Attribution-No Derivative Works 3.0 License.
spellingShingle Article
Bashyam, Vijayaraghavan
Morioka, Craig
El-Saden, Suzie
Bui, Alex AT
Taira, Ricky K
Identifying relevant medical reports from an assorted report collection using the multinomial naïve Bayes classifier and the UMLS
title Identifying relevant medical reports from an assorted report collection using the multinomial naïve Bayes classifier and the UMLS
title_full Identifying relevant medical reports from an assorted report collection using the multinomial naïve Bayes classifier and the UMLS
title_fullStr Identifying relevant medical reports from an assorted report collection using the multinomial naïve Bayes classifier and the UMLS
title_full_unstemmed Identifying relevant medical reports from an assorted report collection using the multinomial naïve Bayes classifier and the UMLS
title_short Identifying relevant medical reports from an assorted report collection using the multinomial naïve Bayes classifier and the UMLS
title_sort identifying relevant medical reports from an assorted report collection using the multinomial naïve bayes classifier and the umls
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9592058/
https://www.ncbi.nlm.nih.gov/pubmed/36284749
work_keys_str_mv AT bashyamvijayaraghavan identifyingrelevantmedicalreportsfromanassortedreportcollectionusingthemultinomialnaivebayesclassifierandtheumls
AT moriokacraig identifyingrelevantmedicalreportsfromanassortedreportcollectionusingthemultinomialnaivebayesclassifierandtheumls
AT elsadensuzie identifyingrelevantmedicalreportsfromanassortedreportcollectionusingthemultinomialnaivebayesclassifierandtheumls
AT buialexat identifyingrelevantmedicalreportsfromanassortedreportcollectionusingthemultinomialnaivebayesclassifierandtheumls
AT tairarickyk identifyingrelevantmedicalreportsfromanassortedreportcollectionusingthemultinomialnaivebayesclassifierandtheumls