The feasibility of using natural language processing to extract clinical information from breast pathology reports

OBJECTIVE: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine read...

Bibliographic Details
Main Authors: Buckley, Julliette M., Coopey, Suzanne B., Sharko, John, Polubriaginof, Fernanda, Drohan, Brian, Belli, Ahmet K., Kim, Elizabeth M. H., Garber, Judy E., Smith, Barbara L., Gadd, Michele A., Specht, Michelle C., Roche, Constance A., Gudewicz, Thomas M., Hughes, Kevin S.
Format: Online Article Text
Language: English
Published: Medknow Publications & Media Pvt Ltd 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424662/
https://www.ncbi.nlm.nih.gov/pubmed/22934236
http://dx.doi.org/10.4103/2153-3539.97788
author Buckley, Julliette M.
Coopey, Suzanne B.
Sharko, John
Polubriaginof, Fernanda
Drohan, Brian
Belli, Ahmet K.
Kim, Elizabeth M. H.
Garber, Judy E.
Smith, Barbara L.
Gadd, Michele A.
Specht, Michelle C.
Roche, Constance A.
Gudewicz, Thomas M.
Hughes, Kevin S.
author_sort Buckley, Julliette M.
collection PubMed
description OBJECTIVE: The opportunity to integrate clinical decision support systems into clinical practice is limited due to the lack of structured, machine readable data in the current format of the electronic health record. Natural language processing has been designed to convert free text into machine readable data. The aim of the current study was to ascertain the feasibility of using natural language processing to extract clinical information from >76,000 breast pathology reports. APPROACH AND PROCEDURE: Breast pathology reports from three institutions were analyzed using natural language processing software (Clearforest, Waltham, MA) to extract information on a variety of pathologic diagnoses of interest. Data tables were created from the extracted information according to date of surgery, side of surgery, and medical record number. The variety of ways in which each diagnosis could be represented was recorded, as a means of demonstrating the complexity of machine interpretation of free text. RESULTS: There was widespread variation in how pathologists reported common pathologic diagnoses. We report, for example, 124 ways of saying invasive ductal carcinoma and 95 ways of saying invasive lobular carcinoma. There were >4000 ways of saying invasive ductal carcinoma was not present. Natural language processor sensitivity and specificity were 99.1% and 96.5% when compared to expert human coders. CONCLUSION: We have demonstrated how a large body of free text medical information such as seen in breast pathology reports, can be converted to a machine readable format using natural language processing, and described the inherent complexities of the task.
format Online
Article
Text
id pubmed-3424662
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Medknow Publications & Media Pvt Ltd
record_format MEDLINE/PubMed
spelling pubmed-3424662 2012-08-29 J Pathol Inform Technical Note Medknow Publications & Media Pvt Ltd 2012-06-30 /pmc/articles/PMC3424662/ /pubmed/22934236 http://dx.doi.org/10.4103/2153-3539.97788 Text en Copyright: © 2012 Buckley JM. http://creativecommons.org/licenses/by-nc-sa/3.0 This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title The feasibility of using natural language processing to extract clinical information from breast pathology reports
topic Technical Note
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3424662/
https://www.ncbi.nlm.nih.gov/pubmed/22934236
http://dx.doi.org/10.4103/2153-3539.97788