
Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment

BACKGROUND: The population-based assessment of patient-centered outcomes (PCOs) has been limited by the difficulty of collecting these data efficiently and accurately. Natural language processing (NLP) pipelines can determine whether a clinical note within an electronic medical record contains evidence on these da...


Bibliographic Details
Main Authors:	Banerjee, Imon; Li, Kevin; Seneviratne, Martin; Ferrari, Michelle; Seto, Tina; Brooks, James D; Rubin, Daniel L; Hernandez-Boussard, Tina
Format:	Online Article Text
Language:	English
Published:	Oxford University Press 2019
Subjects:	Research and Applications
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6482003/ https://www.ncbi.nlm.nih.gov/pubmed/31032481 http://dx.doi.org/10.1093/jamiaopen/ooy057

author	Banerjee, Imon; Li, Kevin; Seneviratne, Martin; Ferrari, Michelle; Seto, Tina; Brooks, James D; Rubin, Daniel L; Hernandez-Boussard, Tina
author_sort	Banerjee, Imon
collection	PubMed
description	BACKGROUND: The population-based assessment of patient-centered outcomes (PCOs) has been limited by the difficulty of collecting these data efficiently and accurately. Natural language processing (NLP) pipelines can determine whether a clinical note within an electronic medical record contains evidence on these data. We present and demonstrate the accuracy of an NLP pipeline that aims to assess the presence, absence, or risk discussion of two important PCOs following prostate cancer treatment: urinary incontinence (UI) and bowel dysfunction (BD). METHODS: We propose a weakly supervised NLP approach that annotates electronic medical record clinical notes without requiring manual chart review. A weighted function of neural word embeddings was used to create a sentence-level vector representation of relevant expressions extracted from the clinical notes. Sentence vectors were used as input for a multinomial logistic model, with the output being either presence, absence, or risk discussion of UI/BD. The classifier was trained on automated sentence annotations that depend only on domain-specific dictionaries (weak supervision). RESULTS: The model achieved an average F1 score of 0.86 for the sentence-level, three-tier classification task (presence/absence/risk) in both UI and BD. The model also outperformed a pre-existing rule-based model for note-level annotation of UI by a significant margin. CONCLUSIONS: We demonstrate a machine learning method to categorize clinical notes based on important PCOs. The method trains a classifier on sentence vector representations labeled with a domain-specific dictionary, which eliminates the need for manual engineering of linguistic rules or manual chart review for extracting the PCOs. The weakly supervised NLP pipeline showed promising sensitivity and specificity for identifying important PCOs in unstructured clinical text notes compared to rule-based algorithms.
TRIAL REGISTRATION: This is a chart review study approved by the Institutional Review Board (IRB).
format	Online Article Text
id	pubmed-6482003
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Oxford University Press
record_format	MEDLINE/PubMed
spelling	pubmed-6482003 2019-04-24 JAMIA Open Research and Applications Oxford University Press 2019-01-04 /pmc/articles/PMC6482003/ /pubmed/31032481 http://dx.doi.org/10.1093/jamiaopen/ooy057 Text en © The Author(s) 2019. Published by Oxford University Press on behalf of the American Medical Informatics Association. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title	Weakly supervised natural language processing for assessing patient-centered outcome following prostate cancer treatment
topic	Research and Applications
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6482003/ https://www.ncbi.nlm.nih.gov/pubmed/31032481 http://dx.doi.org/10.1093/jamiaopen/ooy057
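
The METHODS in the description field outline a three-step pipeline: dictionary-based weak labels, weighted word-embedding sentence vectors, and a multinomial logistic classifier. The following is a minimal Python sketch of that idea, not the authors' implementation. The term dictionaries, the toy 3-dimensional embeddings, the IDF-style weighting, the example sentences, and every function name are hypothetical stand-ins, since this record does not give the study's dictionaries, embedding model, or weighting function.

```python
import math

# Hypothetical dictionaries; the study's actual term lists are not part of this record.
TARGET_TERMS = {"incontinence", "leakage"}
NEGATION_TERMS = {"no", "denies"}
RISK_TERMS = {"risk", "discussed"}
LABELS = ["presence", "absence", "risk"]

# Toy 3-dimensional vectors standing in for embeddings trained on clinical notes.
EMBEDDINGS = {
    "incontinence": [1.0, 0.0, 0.0], "leakage": [0.9, 0.1, 0.0],
    "no": [0.0, 1.0, 0.0], "denies": [0.0, 0.9, 0.1],
    "risk": [0.0, 0.0, 1.0], "discussed": [0.1, 0.0, 0.9],
}

def weak_label(sentence):
    """Label a sentence from dictionary matches alone; None if no target term occurs."""
    tokens = set(sentence.lower().split())
    if not tokens & TARGET_TERMS:
        return None
    if tokens & RISK_TERMS:
        return "risk"
    if tokens & NEGATION_TERMS:
        return "absence"
    return "presence"

def idf_weights(sentences):
    """Inverse-document-frequency weight per word (an assumed choice of weighting)."""
    counts = {}
    for s in sentences:
        for tok in set(s.lower().split()):
            counts[tok] = counts.get(tok, 0) + 1
    return {tok: math.log(1 + len(sentences) / n) for tok, n in counts.items()}

def sentence_vector(sentence, weights):
    """Weighted average of the embeddings of the known words in the sentence."""
    total, norm = [0.0, 0.0, 0.0], 0.0
    for tok in sentence.lower().split():
        if tok in EMBEDDINGS:
            w = weights.get(tok, 1.0)
            total = [t + w * v for t, v in zip(total, EMBEDDINGS[tok])]
            norm += w
    return [t / norm for t in total] if norm else total

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def train_softmax(X, y, epochs=5000, lr=1.0):
    """Multinomial logistic regression fitted by full-batch gradient descent."""
    k, d = len(LABELS), len(X[0]) + 1  # +1 for the bias feature
    W = [[0.0] * d for _ in range(k)]
    for _ in range(epochs):
        grad = [[0.0] * d for _ in range(k)]
        for x, label in zip(X, y):
            xb = x + [1.0]
            p = softmax([sum(w * v for w, v in zip(row, xb)) for row in W])
            for c in range(k):
                err = p[c] - (1.0 if LABELS[c] == label else 0.0)
                for j in range(d):
                    grad[c][j] += err * xb[j] / len(X)
        W = [[w - lr * g for w, g in zip(rw, rg)] for rw, rg in zip(W, grad)]
    return W

def predict(W, x):
    xb = x + [1.0]
    scores = [sum(w * v for w, v in zip(row, xb)) for row in W]
    return LABELS[scores.index(max(scores))]

# Made-up example sentences, not taken from any clinical note.
NOTES = [
    "patient reports urinary incontinence",
    "he has leakage at night",
    "patient denies incontinence",
    "no leakage reported",
    "discussed risk of incontinence",
    "risk of leakage explained",
    "follow up in six months",  # no target term, so it receives no label
]
labeled = [(s, weak_label(s)) for s in NOTES if weak_label(s) is not None]
weights = idf_weights([s for s, _ in labeled])
X = [sentence_vector(s, weights) for s, _ in labeled]
y = [lab for _, lab in labeled]
W = train_softmax(X, y)
predictions = [predict(W, x) for x in X]
```

The classifier here only reproduces its own dictionary-derived labels on the training sentences. Judging real performance, such as the F1 score reported in the description, would require comparison against manually annotated sentences, which this sketch does not include.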
