
A Natural Language Processing–Assisted Extraction System for Gleason Scores: Development and Usability Study

BACKGROUND: Natural language processing (NLP) offers significantly faster variable extraction compared to traditional human extraction but cannot interpret complicated notes as well as humans can. Thus, we hypothesized that an “NLP-assisted” extraction system, which uses humans for complicated notes...


Bibliographic Details
Main Authors: Yu, Shun; Le, Anh; Feld, Emily; Schriver, Emily; Gabriel, Peter; Doucette, Abigail; Narayan, Vivek; Feldman, Michael; Schwartz, Lauren; Maxwell, Kara; Mowery, Danielle
Format: Online Article Text
Language: English
Published: JMIR Publications 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8285739/
https://www.ncbi.nlm.nih.gov/pubmed/34255641
http://dx.doi.org/10.2196/27970
author Yu, Shun
Le, Anh
Feld, Emily
Schriver, Emily
Gabriel, Peter
Doucette, Abigail
Narayan, Vivek
Feldman, Michael
Schwartz, Lauren
Maxwell, Kara
Mowery, Danielle
author_sort Yu, Shun
collection PubMed
description BACKGROUND: Natural language processing (NLP) offers significantly faster variable extraction compared to traditional human extraction but cannot interpret complicated notes as well as humans can. Thus, we hypothesized that an “NLP-assisted” extraction system, which uses humans for complicated notes and NLP for uncomplicated notes, could produce faster extraction without compromising accuracy.
OBJECTIVE: The aim of this study was to develop and pilot an NLP-assisted extraction system to leverage the strengths of both human and NLP extraction of prostate cancer Gleason scores.
METHODS: We collected all available clinical and pathology notes for prostate cancer patients in an unselected academic biobank cohort. We developed an NLP system to extract prostate cancer Gleason scores from both clinical and pathology notes. Next, we designed and implemented the NLP-assisted extraction system algorithm to categorize notes into “uncomplicated” and “complicated” notes. Uncomplicated notes were assigned to NLP extraction and complicated notes were assigned to human extraction. We randomly reviewed 200 patients to assess the accuracy and speed of our NLP-assisted extraction system and compared it to NLP extraction alone and human extraction alone.
RESULTS: Of the 2051 patients in our cohort, the NLP system extracted a prostate surgery Gleason score from 1147 (55.92%) patients and a prostate biopsy Gleason score from 1624 (79.18%) patients. Our NLP-assisted extraction system had an overall accuracy rate of 98.7%, which was similar to the accuracy of human extraction alone (97.5%; P=.17) and significantly higher than the accuracy of NLP extraction alone (95.3%; P<.001). Moreover, our NLP-assisted extraction system reduced the workload of human extractors by approximately 95%, resulting in an average extraction time of 12.7 seconds per patient (vs 256.1 seconds per patient for human extraction alone).
CONCLUSIONS: We demonstrated that an NLP-assisted extraction system was able to achieve much faster Gleason score extraction compared to traditional human extraction without sacrificing accuracy.
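The routing idea described in METHODS can be illustrated with a minimal sketch. It is not the authors' system: the regular expression, the function names `extract_gleason` and `route_note`, and the rule "exactly one distinct score means uncomplicated" are all assumptions made for illustration, since the record does not state how notes were actually categorized.

```python
import re

# Hypothetical pattern (an assumption, not taken from the paper): matches
# mentions such as "Gleason 3+4", "Gleason score 4 + 3" or "Gleason grade: 3+3".
GLEASON_RE = re.compile(
    r"gleason(?:\s+(?:score|grade))?\s*:?\s*(\d)\s*\+\s*(\d)",
    re.IGNORECASE,
)


def extract_gleason(note):
    """Return the set of distinct (primary, secondary) patterns found in a note."""
    found = set()
    for match in GLEASON_RE.finditer(note):
        primary, secondary = int(match.group(1)), int(match.group(2))
        # Gleason patterns range from 1 to 5; ignore anything outside that range.
        if 1 <= primary <= 5 and 1 <= secondary <= 5:
            found.add((primary, secondary))
    return found


def route_note(note):
    """Assign a note to NLP or human extraction.

    Assumed rule: a note with exactly one distinct score is "uncomplicated"
    and handled by NLP; a note with no score or with conflicting scores is
    "complicated" and sent to a human extractor.
    """
    scores = extract_gleason(note)
    if len(scores) == 1:
        return ("nlp", next(iter(scores)))
    return ("human", None)


if __name__ == "__main__":
    notes = [
        "Prostate biopsy: Gleason score 3+4=7 in 2 of 12 cores.",
        "Left lobe Gleason 3+4, right lobe Gleason 4+5.",
        "No malignancy identified.",
    ]
    for text in notes:
        print(route_note(text), "<-", text)
```

In this sketch a note that repeats the same score several times still counts as uncomplicated, because only distinct scores are compared. A real system would likely also need to separate biopsy from surgery scores, which this example does not attempt.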
format Online
Article
Text
id pubmed-8285739
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-8285739 2021-08-03 A Natural Language Processing–Assisted Extraction System for Gleason Scores: Development and Usability Study Yu, Shun Le, Anh Feld, Emily Schriver, Emily Gabriel, Peter Doucette, Abigail Narayan, Vivek Feldman, Michael Schwartz, Lauren Maxwell, Kara Mowery, Danielle JMIR Cancer Original Paper JMIR Publications 2021-07-02 /pmc/articles/PMC8285739/ /pubmed/34255641 http://dx.doi.org/10.2196/27970 Text en ©Shun Yu, Anh Le, Emily Feld, Emily Schriver, Peter Gabriel, Abigail Doucette, Vivek Narayan, Michael Feldman, Lauren Schwartz, Kara Maxwell, Danielle Mowery. Originally published in JMIR Cancer (https://cancer.jmir.org), 02.07.2021. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Cancer, is properly cited. The complete bibliographic information, a link to the original publication on https://cancer.jmir.org/, as well as this copyright and license information must be included.
title A Natural Language Processing–Assisted Extraction System for Gleason Scores: Development and Usability Study
title_sort natural language processing–assisted extraction system for gleason scores: development and usability study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8285739/
https://www.ncbi.nlm.nih.gov/pubmed/34255641
http://dx.doi.org/10.2196/27970
work_keys_str_mv AT yushun anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT leanh anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT feldemily anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT schriveremily anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT gabrielpeter anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT doucetteabigail anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT narayanvivek anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT feldmanmichael anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT schwartzlauren anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT maxwellkara anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT mowerydanielle anaturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT yushun naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT leanh naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT feldemily naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT schriveremily naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT gabrielpeter naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT doucetteabigail naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT narayanvivek naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT feldmanmichael naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT schwartzlauren naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT maxwellkara naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy
AT mowerydanielle naturallanguageprocessingassistedextractionsystemforgleasonscoresdevelopmentandusabilitystudy