Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry
Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system...
Main authors: | AAlAbdulsalam, Abdulrahman K.; Garvin, Jennifer H.; Redd, Andrew; Carter, Marjorie E.; Sweeny, Carol; Meystre, Stephane M. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | American Medical Informatics Association, 2018 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5961766/ https://www.ncbi.nlm.nih.gov/pubmed/29888032 |
_version_ | 1783324775524859904 |
---|---|
author | AAlAbdulsalam, Abdulrahman K. Garvin, Jennifer H. Redd, Andrew Carter, Marjorie E. Sweeny, Carol Meystre, Stephane M. |
author_facet | AAlAbdulsalam, Abdulrahman K. Garvin, Jennifer H. Redd, Andrew Carter, Marjorie E. Sweeny, Carol Meystre, Stephane M. |
author_sort | AAlAbdulsalam, Abdulrahman K. |
collection | PubMed |
description | Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through voluminous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%–98.4% and classification sensitivity: 83.5%–87%). |
format | Online Article Text |
id | pubmed-5961766 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | American Medical Informatics Association |
record_format | MEDLINE/PubMed |
spelling | pubmed-5961766 2018-06-08 Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry AAlAbdulsalam, Abdulrahman K. Garvin, Jennifer H. Redd, Andrew Carter, Marjorie E. Sweeny, Carol Meystre, Stephane M. AMIA Jt Summits Transl Sci Proc Articles Cancer stage is one of the most important prognostic parameters in most cancer subtypes. The American Joint Committee on Cancer (AJCC) specifies criteria for staging each cancer type based on tumor characteristics (T), lymph node involvement (N), and tumor metastasis (M) known as TNM staging system. Information related to cancer stage is typically recorded in clinical narrative text notes and other informal means of communication in the Electronic Health Record (EHR). As a result, human chart-abstractors (known as certified tumor registrars) have to search through voluminous amounts of text to extract accurate stage information and resolve discordance between different data sources. This study proposes novel applications of natural language processing and machine learning to automatically extract and classify TNM stage mentions from records at the Utah Cancer Registry. Our results indicate that TNM stages can be extracted and classified automatically with high accuracy (extraction sensitivity: 95.5%–98.4% and classification sensitivity: 83.5%–87%). American Medical Informatics Association 2018-05-18 /pmc/articles/PMC5961766/ /pubmed/29888032 Text en ©2018 AMIA - All rights reserved. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose |
spellingShingle | Articles AAlAbdulsalam, Abdulrahman K. Garvin, Jennifer H. Redd, Andrew Carter, Marjorie E. Sweeny, Carol Meystre, Stephane M. Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry |
title | Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry |
title_full | Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry |
title_fullStr | Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry |
title_full_unstemmed | Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry |
title_short | Automated Extraction and Classification of Cancer Stage Mentions from Unstructured Text Fields in a Central Cancer Registry |
title_sort | automated extraction and classification of cancer stage mentions from unstructured text fields in a central cancer registry |
topic | Articles |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5961766/ https://www.ncbi.nlm.nih.gov/pubmed/29888032 |
work_keys_str_mv | AT aalabdulsalamabdulrahmank automatedextractionandclassificationofcancerstagementionsfromunstructuredtextfieldsinacentralcancerregistry AT garvinjenniferh automatedextractionandclassificationofcancerstagementionsfromunstructuredtextfieldsinacentralcancerregistry AT reddandrew automatedextractionandclassificationofcancerstagementionsfromunstructuredtextfieldsinacentralcancerregistry AT cartermarjoriee automatedextractionandclassificationofcancerstagementionsfromunstructuredtextfieldsinacentralcancerregistry AT sweenycarol automatedextractionandclassificationofcancerstagementionsfromunstructuredtextfieldsinacentralcancerregistry AT meystrestephanem automatedextractionandclassificationofcancerstagementionsfromunstructuredtextfieldsinacentralcancerregistry |