Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients
Metastatic patterns of spread at the time of cancer recurrence are one of the most important prognostic factors in estimation of clinical course and survival of the patient. This information is not easily accessible since it’s rarely recorded in a structured format. This paper describes a system for...
Main authors: | Soysal, Ergin; Warner, Jeremy L; Denny, Joshua C; Xu, Hua |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Medical Informatics Association 2017 |
Subjects: | Articles |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543353/ https://www.ncbi.nlm.nih.gov/pubmed/28815141 |
_version_ | 1783255133724868608 |
---|---|
author | Soysal, Ergin Warner, Jeremy L Denny, Joshua C Xu, Hua |
author_facet | Soysal, Ergin Warner, Jeremy L Denny, Joshua C Xu, Hua |
author_sort | Soysal, Ergin |
collection | PubMed |
description | Metastatic patterns of spread at the time of cancer recurrence are among the most important prognostic factors for estimating a patient's clinical course and survival. This information is not easily accessible because it is rarely recorded in a structured format. This paper describes a system that categorizes pathology reports by specimen site and detects metastatic status within each report. A clinical NLP pipeline was developed using sentence boundary detection, tokenization, section identification, part-of-speech tagging, and chunking, together with rule-based methods, to extract metastasis site and status in combination with five types of information related to tumor metastases: histological type, grade, specimen site, metastatic status indicators, and the procedure. The system achieved a recall of 0.84 and a precision of 0.88 for metastatic status detection, and a recall of 0.89 and a precision of 0.93 for metastasis site detection. This study demonstrates the feasibility of applying NLP technologies to extract valuable metastasis information from pathology reports, and we believe that it will greatly benefit studies on cancer metastases that use EHRs. |
format | Online Article Text |
id | pubmed-5543353 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
publisher | American Medical Informatics Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-55433532017-08-16 Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients Soysal, Ergin Warner, Jeremy L Denny, Joshua C Xu, Hua AMIA Jt Summits Transl Sci Proc Articles Metastatic patterns of spread at the time of cancer recurrence are one of the most important prognostic factors in estimation of clinical course and survival of the patient. This information is not easily accessible since it’s rarely recorded in a structured format. This paper describes a system for categorization of pathology reports by specimen site and the detection of metastatic status within the report. A clinical NLP pipeline was developed using sentence boundary detection, tokenization, section identification, part-of-speech tagger, and chunker with some rule based methods to extract metastasis site and status in combination with five types of information related to tumor metastases: histological type, grade, specimen site, metastatic status indicators and the procedure. The system achieved a recall of 0.84 and 0.88 precision for metastatic status detection, and 0.89 recall and 0.93 precision for metastasis site detection. This study demonstrates the feasibility of applying NLP technologies to extract valuable metastases information from pathology reports and we believe that it will greatly benefit studies on cancer metastases that utilize EHRs. American Medical Informatics Association 2017-07-26 /pmc/articles/PMC5543353/ /pubmed/28815141 Text en ©2017 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose |
spellingShingle | Articles Soysal, Ergin Warner, Jeremy L Denny, Joshua C Xu, Hua Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients |
title | Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients |
title_full | Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients |
title_fullStr | Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients |
title_full_unstemmed | Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients |
title_short | Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients |
title_sort | identifying metastases-related information from pathology reports of lung cancer patients |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543353/ https://www.ncbi.nlm.nih.gov/pubmed/28815141 |
work_keys_str_mv | AT soysalergin identifyingmetastasesrelatedinformationfrompathologyreportsoflungcancerpatients AT warnerjeremyl identifyingmetastasesrelatedinformationfrompathologyreportsoflungcancerpatients AT dennyjoshuac identifyingmetastasesrelatedinformationfrompathologyreportsoflungcancerpatients AT xuhua identifyingmetastasesrelatedinformationfrompathologyreportsoflungcancerpatients |
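
The description field above mentions sentence boundary detection combined with rule-based extraction of metastatic status and site. The following is only a minimal sketch of that general idea, not the authors' system: the cue patterns, the negation list, the `SITES` lexicon, and the function names `split_sentences` and `analyze_report` are all hypothetical and chosen purely for illustration.

```python
import re

# Hypothetical rules for illustration only; none of these come from the paper.
CUE_RE = re.compile(r"\bmetasta(?:sis|ses|tic)\b")
NEGATION_RE = re.compile(r"\b(?:no|not|without|negative for)\b")
SITES = ("brain", "liver", "bone", "adrenal", "lymph node", "pleura")


def split_sentences(text):
    """Naive sentence boundary detection: split after '.' or ';', or on newlines."""
    parts = re.split(r"(?<=[.;])\s+|\n+", text)
    return [p.strip() for p in parts if p.strip()]


def analyze_report(text):
    """Return one finding per sentence that contains a metastasis cue.

    A cue preceded by a negation phrase in the same sentence is marked
    'negative'; otherwise it is marked 'positive'. Sites are matched by
    simple substring lookup against the SITES lexicon.
    """
    findings = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        cue = CUE_RE.search(lowered)
        if cue is None:
            continue  # sentence says nothing about metastasis
        negated = NEGATION_RE.search(lowered[:cue.start()]) is not None
        findings.append({
            "sentence": sentence,
            "status": "negative" if negated else "positive",
            "sites": [site for site in SITES if site in lowered],
        })
    return findings


# Made-up example text, not taken from any real pathology report.
sample = (
    "Liver, needle biopsy: metastatic adenocarcinoma.\n"
    "Lymph node, excision: negative for metastatic carcinoma.\n"
    "Lung, right upper lobe: adenocarcinoma, moderately differentiated."
)
for finding in analyze_report(sample):
    print(finding["status"], finding["sites"])
```

A real pipeline of the kind the record describes would also need section identification, part-of-speech tagging, and chunking, which this sketch leaves out; the substring-based site lookup in particular is far cruder than a curated specimen-site vocabulary.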