Accelerated curation of checkpoint inhibitor-induced colitis cases from electronic health records
OBJECTIVE: Automatically identifying patients at risk of immune checkpoint inhibitor (ICI)-induced colitis allows physicians to improve patient care. However, predictive models require training data curated from electronic health records (EHR). Our objective is to automatically identify notes documen...
Main authors: | Rahman, Protiva; Ye, Cheng; Mittendorf, Kathleen F; Lenoue-Newton, Michele; Micheel, Christine; Wolber, Jan; Osterman, Travis; Fabbri, Daniel |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2023 |
Subjects: | Research and Applications |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10066800/ https://www.ncbi.nlm.nih.gov/pubmed/37012912 http://dx.doi.org/10.1093/jamiaopen/ooad017 |
_version_ | 1785018333818519552 |
---|---|
author | Rahman, Protiva; Ye, Cheng; Mittendorf, Kathleen F; Lenoue-Newton, Michele; Micheel, Christine; Wolber, Jan; Osterman, Travis; Fabbri, Daniel |
author_sort | Rahman, Protiva |
collection | PubMed |
description | OBJECTIVE: Automatically identifying patients at risk of immune checkpoint inhibitor (ICI)-induced colitis allows physicians to improve patient care. However, predictive models require training data curated from electronic health records (EHR). Our objective is to automatically identify notes documenting ICI-colitis cases to accelerate data curation. MATERIALS AND METHODS: We present a data pipeline to automatically identify ICI-colitis from EHR notes, accelerating chart review. The pipeline relies on BERT, a state-of-the-art natural language processing (NLP) model. The first stage of the pipeline segments long notes using keywords identified through a logistic classifier and applies BERT to identify ICI-colitis notes. The next stage uses a second BERT model tuned to identify false-positive notes and removes notes that were likely flagged as positive merely for mentioning colitis as a side effect. The final stage further accelerates curation by highlighting the colitis-relevant portions of notes. Specifically, we use BERT’s attention scores to find high-density regions describing colitis. RESULTS: The overall pipeline identified colitis notes with 84% precision and reduced the curator note review load by 75%. The segment BERT classifier had a high recall of 0.98, which is crucial given the low incidence (<10%) of colitis. DISCUSSION: Curation from EHR notes is a burdensome task, especially when the curation topic is complicated. Methods described in this work are not only useful for ICI-colitis but can also be adapted for other domains. CONCLUSION: Our extraction pipeline reduces manual note review load and makes EHR data more accessible for research. |
format | Online Article Text |
id | pubmed-10066800 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10066800 2023-04-02 JAMIA Open Research and Applications Oxford University Press 2023-04-01 /pmc/articles/PMC10066800/ /pubmed/37012912 http://dx.doi.org/10.1093/jamiaopen/ooad017 Text en © The Author(s) 2023. Published by Oxford University Press on behalf of the American Medical Informatics Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
title | Accelerated curation of checkpoint inhibitor-induced colitis cases from electronic health records |
topic | Research and Applications |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10066800/ https://www.ncbi.nlm.nih.gov/pubmed/37012912 http://dx.doi.org/10.1093/jamiaopen/ooad017 |
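
The description above mentions two mechanical steps that can be illustrated without any model: cutting a long note into segments around keywords, and locating a high-density region from per-token scores. The following is a minimal sketch under stated assumptions, not the authors' implementation. The function names `segment_by_keywords` and `densest_span`, the `radius` and `width` parameters, and the sample note are all hypothetical. In the paper the keywords come from a logistic classifier and the scores from BERT attention; here both are simply passed in as inputs.

```python
from typing import List, Sequence, Set, Tuple


def segment_by_keywords(
    tokens: Sequence[str], keywords: Set[str], radius: int = 3
) -> List[Tuple[int, int]]:
    """Return merged [start, end) token windows centred on keyword hits.

    Assumption: a fixed window of `radius` tokens on each side of a hit.
    """
    windows: List[Tuple[int, int]] = []
    for i, tok in enumerate(tokens):
        if tok.lower().strip(".,;:") in keywords:
            start = max(0, i - radius)
            end = min(len(tokens), i + radius + 1)
            if windows and start <= windows[-1][1]:
                # Overlapping or touching windows are merged into one segment.
                windows[-1] = (windows[-1][0], max(windows[-1][1], end))
            else:
                windows.append((start, end))
    return windows


def densest_span(scores: Sequence[float], width: int) -> Tuple[int, int]:
    """Return the [start, end) window of `width` tokens with the largest score sum.

    Assumption: `scores` holds one relevance score per token, for example an
    attention weight supplied by some upstream model.
    """
    if not scores:
        return (0, 0)
    width = max(1, min(width, len(scores)))
    current = sum(scores[:width])
    best_sum, best_start = current, 0
    for start in range(1, len(scores) - width + 1):
        # Slide the window one token to the right.
        current += scores[start + width - 1] - scores[start - 1]
        if current > best_sum:
            best_sum, best_start = current, start
    return (best_start, best_start + width)


if __name__ == "__main__":
    note = (
        "Patient on nivolumab reports diarrhea and colitis today . "
        "Plan follow up in clinic next week for labs"
    ).split()
    print(segment_by_keywords(note, {"colitis", "diarrhea"}, radius=2))
    print(densest_span([1, 1, 9, 8, 1], width=2))
```

Each segment returned by `segment_by_keywords` would then be passed to a classifier, and `densest_span` would be applied to that classifier's per-token scores to choose the text to highlight for the curator.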