The feasibility of using natural language processing to extract clinical information from breast pathology reports
OBJECTIVE: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine read...
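Main authors: | Buckley, Julliette M.; Coopey, Suzanne B.; Sharko, John; Polubriaginof, Fernanda; Drohan, Brian; Belli, Ahmet K.; Kim, Elizabeth M. H.; Garber, Judy E.; Smith, Barbara L.; Gadd, Michele A.; Specht, Michelle C.; Roche, Constance A.; Gudewicz, Thomas M.; Hughes, Kevin S. |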
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Medknow Publications & Media Pvt Ltd, 2012 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424662/ https://www.ncbi.nlm.nih.gov/pubmed/22934236 http://dx.doi.org/10.4103/2153-3539.97788 |
_version_ | 1782241245788110848 |
---|---|
author | Buckley, Julliette M. Coopey, Suzanne B. Sharko, John Polubriaginof, Fernanda Drohan, Brian Belli, Ahmet K. Kim, Elizabeth M. H. Garber, Judy E. Smith, Barbara L. Gadd, Michele A. Specht, Michelle C. Roche, Constance A. Gudewicz, Thomas M. Hughes, Kevin S. |
author_facet | Buckley, Julliette M. Coopey, Suzanne B. Sharko, John Polubriaginof, Fernanda Drohan, Brian Belli, Ahmet K. Kim, Elizabeth M. H. Garber, Judy E. Smith, Barbara L. Gadd, Michele A. Specht, Michelle C. Roche, Constance A. Gudewicz, Thomas M. Hughes, Kevin S. |
author_sort | Buckley, Julliette M. |
collection | PubMed |
description | OBJECTIVE: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. APPROACH AND PROCEDURE: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. RESULTS: There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. CONCLUSION: We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task. |
format | Online Article Text |
id | pubmed-3424662 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | Medknow Publications & Media Pvt Ltd |
record_format | MEDLINE/PubMed |
spelling | pubmed-3424662 2012-08-29 The feasibility of using natural language processing to extract clinical information from breast pathology reports Buckley, Julliette M. Coopey, Suzanne B. Sharko, John Polubriaginof, Fernanda Drohan, Brian Belli, Ahmet K. Kim, Elizabeth M. H. Garber, Judy E. Smith, Barbara L. Gadd, Michele A. Specht, Michelle C. Roche, Constance A. Gudewicz, Thomas M. Hughes, Kevin S. J Pathol Inform Technical Note OBJECTIVE: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. APPROACH AND PROCEDURE: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. RESULTS: There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. CONCLUSION: We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task. Medknow Publications & Media Pvt Ltd 2012-06-30 /pmc/articles/PMC3424662/ /pubmed/22934236 http://dx.doi.org/10.4103/2153-3539.97788 Text en Copyright: © 2012 Buckley JM. http://creativecommons.org/licenses/by-nc-sa/3.0 This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. |
spellingShingle | Technical Note Buckley, Julliette M. Coopey, Suzanne B. Sharko, John Polubriaginof, Fernanda Drohan, Brian Belli, Ahmet K. Kim, Elizabeth M. H. Garber, Judy E. Smith, Barbara L. Gadd, Michele A. Specht, Michelle C. Roche, Constance A. Gudewicz, Thomas M. Hughes, Kevin S. The feasibility of using natural language processing to extract clinical information from breast pathology reports |
title | The feasibility of using natural language processing to extract clinical information from breast pathology reports |
title_full | The feasibility of using natural language processing to extract clinical information from breast pathology reports |
title_fullStr | The feasibility of using natural language processing to extract clinical information from breast pathology reports |
title_full_unstemmed | The feasibility of using natural language processing to extract clinical information from breast pathology reports |
title_short | The feasibility of using natural language processing to extract clinical information from breast pathology reports |
title_sort | feasibility of using natural language processing to extract clinical information from breast pathology reports |
topic | Technical Note |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424662/ https://www.ncbi.nlm.nih.gov/pubmed/22934236 http://dx.doi.org/10.4103/2153-3539.97788 |