Accelerated curation of checkpoint inhibitor-induced colitis cases from electronic health records

OBJECTIVE: Automatically identifying patients at risk of immune checkpoint inhibitor (ICI)-induced colitis allows physicians to improve patient care. However, predictive models require training data curated from electronic health records (EHR). Our objective is to automatically identify notes documen...

Bibliographic Details
Main authors: Rahman, Protiva, Ye, Cheng, Mittendorf, Kathleen F, Lenoue-Newton, Michele, Micheel, Christine, Wolber, Jan, Osterman, Travis, Fabbri, Daniel
Format: Online Article Text
Language: English
Published: Oxford University Press 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10066800/
https://www.ncbi.nlm.nih.gov/pubmed/37012912
http://dx.doi.org/10.1093/jamiaopen/ooad017
author Rahman, Protiva
Ye, Cheng
Mittendorf, Kathleen F
Lenoue-Newton, Michele
Micheel, Christine
Wolber, Jan
Osterman, Travis
Fabbri, Daniel
author_sort Rahman, Protiva
collection PubMed
description OBJECTIVE: Automatically identifying patients at risk of immune checkpoint inhibitor (ICI)-induced colitis allows physicians to improve patient care. However, predictive models require training data curated from electronic health records (EHR). Our objective is to automatically identify notes documenting ICI-colitis cases to accelerate data curation. MATERIALS AND METHODS: We present a data pipeline to automatically identify ICI-colitis from EHR notes, accelerating chart review. The pipeline relies on BERT, a state-of-the-art natural language processing (NLP) model. The first stage of the pipeline segments long notes using keywords identified through a logistic classifier and applies BERT to identify ICI-colitis notes. The next stage uses a second BERT model tuned to identify false positive notes and remove notes that were likely positive for mentioning colitis as a side effect. The final stage further accelerates curation by highlighting the colitis-relevant portions of notes. Specifically, we use BERT’s attention scores to find high-density regions describing colitis. RESULTS: The overall pipeline identified colitis notes with 84% precision and reduced the curator note review load by 75%. The segment BERT classifier had a high recall of 0.98, which is crucial to identify the low incidence (<10%) of colitis. DISCUSSION: Curation from EHR notes is a burdensome task, especially when the curation topic is complicated. Methods described in this work are not only useful for ICI colitis but can also be adapted for other domains. CONCLUSION: Our extraction pipeline reduces manual note review load and makes EHR data more accessible for research.
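The abstract names two mechanical steps: segmenting long notes around keywords, and locating a high-density region of attention scores. The Python sketch below illustrates one plausible reading of each and is not the authors' implementation. The function names, the `radius` and `width` parameters, the keyword set, and the per-token scores are all hypothetical; in the described pipeline the keywords come from a logistic classifier and the scores from BERT attention, neither of which is reproduced here.

```python
from typing import List, Set, Tuple


def segment_by_keywords(
    tokens: List[str], keywords: Set[str], radius: int = 3
) -> List[Tuple[int, int]]:
    """Return merged [start, end) token spans lying within `radius` tokens of a keyword.

    Assumption: a fixed-size context window around each keyword hit; the
    abstract does not say how segments are actually delimited.
    """
    spans: List[Tuple[int, int]] = []
    for i, tok in enumerate(tokens):
        if tok.lower() in keywords:
            start = max(0, i - radius)
            end = min(len(tokens), i + radius + 1)
            if spans and start <= spans[-1][1]:
                # Overlapping or touching windows are merged into one segment.
                spans[-1] = (spans[-1][0], max(spans[-1][1], end))
            else:
                spans.append((start, end))
    return spans


def densest_window(scores: List[float], width: int) -> Tuple[int, int]:
    """Return the [start, end) window of `width` tokens with the largest score sum.

    Assumption: "high-density region" is read as the fixed-width window that
    maximizes summed per-token scores (a simple sliding-window scan).
    """
    if not scores:
        raise ValueError("scores must be non-empty")
    width = max(1, min(width, len(scores)))
    current = sum(scores[:width])
    best_sum, best_start = current, 0
    for start in range(1, len(scores) - width + 1):
        # Slide the window one token right: add the new token, drop the old one.
        current += scores[start + width - 1] - scores[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return best_start, best_start + width


# Hypothetical note text and scores, for illustration only.
note = "patient on nivolumab reports diarrhea and colitis today otherwise well".split()
segments = segment_by_keywords(note, {"colitis", "diarrhea"}, radius=1)
highlight = densest_window([0.1, 0.0, 0.9, 0.8, 0.1], width=2)
```

Here `segments` holds the token spans that would be passed to a classifier, and `highlight` is the span a reviewer would see emphasized. A fixed-width window keeps the scan linear in note length; a real system might instead threshold the scores or allow variable-width regions.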
format Online
Article
Text
id pubmed-10066800
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-10066800 2023-04-02 Accelerated curation of checkpoint inhibitor-induced colitis cases from electronic health records Rahman, Protiva Ye, Cheng Mittendorf, Kathleen F Lenoue-Newton, Michele Micheel, Christine Wolber, Jan Osterman, Travis Fabbri, Daniel JAMIA Open Research and Applications
Oxford University Press 2023-04-01 /pmc/articles/PMC10066800/ /pubmed/37012912 http://dx.doi.org/10.1093/jamiaopen/ooad017 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. https://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Accelerated curation of checkpoint inhibitor-induced colitis cases from electronic health records
topic Research and Applications