
Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients

Metastatic patterns of spread at the time of cancer recurrence are among the most important prognostic factors in estimating a patient's clinical course and survival. This information is not easily accessible because it is rarely recorded in a structured format. This paper describes a system for...


Bibliographic Details
Main Authors: Soysal, Ergin; Warner, Jeremy L; Denny, Joshua C; Xu, Hua
Format: Online Article Text
Language: English
Published: American Medical Informatics Association 2017
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543353/
https://www.ncbi.nlm.nih.gov/pubmed/28815141
author Soysal, Ergin
Warner, Jeremy L
Denny, Joshua C
Xu, Hua
collection PubMed
description Metastatic patterns of spread at the time of cancer recurrence are among the most important prognostic factors in estimating a patient's clinical course and survival. This information is not easily accessible because it is rarely recorded in a structured format. This paper describes a system for categorizing pathology reports by specimen site and detecting metastatic status within the report. A clinical NLP pipeline was developed using sentence boundary detection, tokenization, section identification, part-of-speech tagging, and chunking, together with rule-based methods, to extract metastasis site and status in combination with five types of information related to tumor metastases: histological type, grade, specimen site, metastatic status indicators, and the procedure. The system achieved a recall of 0.84 and a precision of 0.88 for metastatic status detection, and a recall of 0.89 and a precision of 0.93 for metastasis site detection. This study demonstrates the feasibility of applying NLP technologies to extract valuable metastases-related information from pathology reports, and we believe that it will greatly benefit studies on cancer metastases that use EHRs.
format Online
Article
Text
id pubmed-5543353
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher American Medical Informatics Association
record_format MEDLINE/PubMed
spelling pubmed-5543353 2017-08-16 Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients Soysal, Ergin Warner, Jeremy L Denny, Joshua C Xu, Hua AMIA Jt Summits Transl Sci Proc Articles American Medical Informatics Association 2017-07-26 /pmc/articles/PMC5543353/ /pubmed/28815141 Text en ©2017 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose
title Identifying Metastases-related Information from Pathology Reports of Lung Cancer Patients
topic Articles
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5543353/
https://www.ncbi.nlm.nih.gov/pubmed/28815141