Application of Text Information Extraction System for Real-Time Cancer Case Identification in an Integrated Healthcare Organization
BACKGROUND: Surgical pathology reports (SPR) contain rich clinical diagnosis information. The text information extraction system (TIES) is an end-to-end application leveraging natural language processing technologies and focused on the processing of pathology and/or radiology reports. METHODS: We de...
Main Authors: | Xie, Fagen; Lee, Janet; Munoz-Plaza, Corrine E.; Hahn, Erin E.; Chen, Wansu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Medknow Publications & Media Pvt Ltd, 2017 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5760847/ https://www.ncbi.nlm.nih.gov/pubmed/29416911 http://dx.doi.org/10.4103/jpi.jpi_55_17 |
_version_ | 1783291448089640960 |
---|---|
author | Xie, Fagen; Lee, Janet; Munoz-Plaza, Corrine E.; Hahn, Erin E.; Chen, Wansu |
author_facet | Xie, Fagen; Lee, Janet; Munoz-Plaza, Corrine E.; Hahn, Erin E.; Chen, Wansu |
author_sort | Xie, Fagen |
collection | PubMed |
description | BACKGROUND: Surgical pathology reports (SPR) contain rich clinical diagnosis information. The text information extraction system (TIES) is an end-to-end application leveraging natural language processing technologies and focused on the processing of pathology and/or radiology reports. METHODS: We deployed the TIES system and integrated SPRs into the TIES system on a daily basis at Kaiser Permanente Southern California. The breast cancer cases diagnosed in December 2013 from the Cancer Registry (CANREG) were used to validate the performance of the TIES system. The National Cancer Institute Metathesaurus (NCIM) concept terms and codes to describe breast cancer were identified through the Unified Medical Language System Terminology Service (UTS) application. The identified NCIM codes were used to search for the coded SPRs in the back-end datastore directly. The identified cases were then compared with the breast cancer patients pulled from CANREG. RESULTS: A total of 437 breast cancer concept terms and 14 combinations of “breast” and “cancer” terms were identified from the UTS application. A total of 249 breast cancer cases diagnosed in December 2013 were pulled from CANREG. Out of these 249 cases, 241 were successfully identified by the TIES system from a total of 457 reports. The TIES system also identified an additional 277 cases that were not part of the validation sample. Out of the 277 cases, 11% were determined to be highly likely cases after manual examination, and 86% were in CANREG but had been diagnosed in months other than December 2013. CONCLUSIONS: The study demonstrated that the TIES system can effectively identify potential breast cancer cases in our care setting. Identified potential cases can be easily confirmed by reviewing the corresponding annotated reports through the front-end visualization interface. The TIES system is a great tool for identifying potential cases of various cancers in a timely manner and on a regular basis in support of clinical research studies. |
format | Online Article Text |
id | pubmed-5760847 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | Medknow Publications & Media Pvt Ltd |
record_format | MEDLINE/PubMed |
spelling | pubmed-5760847 2018-02-07 Application of Text Information Extraction System for Real-Time Cancer Case Identification in an Integrated Healthcare Organization Xie, Fagen; Lee, Janet; Munoz-Plaza, Corrine E.; Hahn, Erin E.; Chen, Wansu J Pathol Inform Original Article Medknow Publications & Media Pvt Ltd 2017-12-14 /pmc/articles/PMC5760847/ /pubmed/29416911 http://dx.doi.org/10.4103/jpi.jpi_55_17 Text en Copyright: © 2017 Journal of Pathology Informatics http://creativecommons.org/licenses/by-nc-sa/3.0 This is an open access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License, which allows others to remix, tweak, and build upon the work non-commercially, as long as the author is credited and the new creations are licensed under the identical terms. |
spellingShingle | Original Article Xie, Fagen Lee, Janet Munoz-Plaza, Corrine E. Hahn, Erin E. Chen, Wansu Application of Text Information Extraction System for Real-Time Cancer Case Identification in an Integrated Healthcare Organization |
title | Application of Text Information Extraction System for Real-Time Cancer Case Identification in an Integrated Healthcare Organization |
title_sort | application of text information extraction system for real-time cancer case identification in an integrated healthcare organization |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5760847/ https://www.ncbi.nlm.nih.gov/pubmed/29416911 http://dx.doi.org/10.4103/jpi.jpi_55_17 |
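
The counts reported in the description can be turned into a few summary proportions. The sketch below is only an illustration of that arithmetic, not the authors' analysis code: the variable and function names are hypothetical, the sensitivity figure is derived here from the reported counts, and the case counts recovered from the 11% and 86% figures are approximate because the description gives rounded percentages.

```python
# Counts taken from the record's description field.
registry_cases = 249    # CANREG breast cancer cases diagnosed in December 2013
found_by_ties = 241     # of those, cases that TIES identified
additional_cases = 277  # TIES-identified cases outside the validation sample


def proportion(part: int, whole: int) -> float:
    """Return part / whole, rejecting an empty denominator."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# Share of registry cases that TIES found (sensitivity against CANREG).
sensitivity = proportion(found_by_ties, registry_cases)
missed_cases = registry_cases - found_by_ties

# Approximate counts behind the rounded percentages (assumption: the
# percentages apply to all 277 additional cases).
likely_new_cases = round(0.11 * additional_cases)
other_month_cases = round(0.86 * additional_cases)
unexplained_cases = additional_cases - likely_new_cases - other_month_cases

print(f"sensitivity: {sensitivity:.1%}, missed: {missed_cases}")
print(f"likely new: ~{likely_new_cases}, other months: ~{other_month_cases}, "
      f"remaining: ~{unexplained_cases}")
```

Under these assumptions, TIES found roughly 97% of the registry cases in the validation month, and most of the additional cases it flagged were registry cases diagnosed in other months.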