Concept Discovery for Pathology Reports using an N-gram Model
A large amount of valuable information is available in plain text clinical reports. New techniques and technologies are applied to extract information from these reports. One of the leading systems in the cancer community is the Cancer Text Information Extraction System (caTIES), which was developed...
Main authors: | Yip, Vincent; Mete, Mutlu; Topaloglu, Umit; Kockara, Sinan |
---|---|
Format: | Text |
Language: | English |
Published: | American Medical Informatics Association, 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041542/ https://www.ncbi.nlm.nih.gov/pubmed/21347147 |
_version_ | 1782198443025891328 |
---|---|
author | Yip, Vincent; Mete, Mutlu; Topaloglu, Umit; Kockara, Sinan |
author_facet | Yip, Vincent; Mete, Mutlu; Topaloglu, Umit; Kockara, Sinan |
author_sort | Yip, Vincent |
collection | PubMed |
description | A large amount of valuable information is available in plain text clinical reports. New techniques and technologies are applied to extract information from these reports. One of the leading systems in the cancer community is the Cancer Text Information Extraction System (caTIES), which was developed with caBIG-compliant data structures. caTIES embedded two key components for extracting data: MMTx and GATE. In this paper, an n-gram based framework is proven to be capable of discovering concepts from text reports. MetaMap is used to map medical terms to the National Cancer Institute (NCI) Metathesaurus and the Unified Medical Language System (UMLS) Metathesaurus for verifying legitimate medical data. The final concepts from our framework and caTIES are weighted based on our scoring model. The scores show that, on average, our framework scores higher than caTIES on 848 (36.9%) of reports. Furthermore, 1388 (60.5%) of reports have similar performances on both systems. |
format | Text |
id | pubmed-3041542 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | American Medical Informatics Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-30415422011-02-23 Concept Discovery for Pathology Reports using an N-gram Model Yip, Vincent Mete, Mutlu Topaloglu, Umit Kockara, Sinan Summit on Translat Bioinforma Articles A large amount of valuable information is available in plain text clinical reports. New techniques and technologies are applied to extract information from these reports. One of the leading systems in the cancer community is the Cancer Text Information Extraction System (caTIES), which was developed with caBIG-compliant data structures. caTIES embedded two key components for extracting data: MMTx and GATE. In this paper, an n-gram based framework is proven to be capable of discovering concepts from text reports. MetaMap is used to map medical terms to the National Cancer Institute (NCI) Metathesaurus and the Unified Medical Language System (UMLS) Metathesaurus for verifying legitimate medical data. The final concepts from our framework and caTIES are weighted based on our scoring model. The scores show that, on average, our framework scores higher than caTIES on 848 (36.9%) of reports. Furthermore, 1388 (60.5%) of reports have similar performances on both systems. American Medical Informatics Association 2010-03-01 /pmc/articles/PMC3041542/ /pubmed/21347147 Text en ©2010 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose |
spellingShingle | Articles Yip, Vincent Mete, Mutlu Topaloglu, Umit Kockara, Sinan Concept Discovery for Pathology Reports using an N-gram Model |
title | Concept Discovery for Pathology Reports using an N-gram Model |
title_full | Concept Discovery for Pathology Reports using an N-gram Model |
title_fullStr | Concept Discovery for Pathology Reports using an N-gram Model |
title_full_unstemmed | Concept Discovery for Pathology Reports using an N-gram Model |
title_short | Concept Discovery for Pathology Reports using an N-gram Model |
title_sort | concept discovery for pathology reports using an n-gram model |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3041542/ https://www.ncbi.nlm.nih.gov/pubmed/21347147 |