
Development of a natural language processing algorithm to detect chronic cough in electronic health records

BACKGROUND: Chronic cough (CC) is difficult to identify in electronic health records (EHRs) due to the lack of specific diagnostic codes. We developed a natural language processing (NLP) model to identify cough in free-text provider notes in EHRs from multiple health care providers with the objectiv...


Bibliographic Details
Main authors: Bali, Vishal, Weaver, Jessica, Turzhitsky, Vladimir, Schelfhout, Jonathan, Paudel, Misti L., Hulbert, Erin, Peterson-Brandt, Jesse, Currie, Anne-Marie Guerra, Bakka, Dylan
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9238070/
https://www.ncbi.nlm.nih.gov/pubmed/35764999
http://dx.doi.org/10.1186/s12890-022-02035-6
author Bali, Vishal
Weaver, Jessica
Turzhitsky, Vladimir
Schelfhout, Jonathan
Paudel, Misti L.
Hulbert, Erin
Peterson-Brandt, Jesse
Currie, Anne-Marie Guerra
Bakka, Dylan
author_sort Bali, Vishal
collection PubMed
description BACKGROUND: Chronic cough (CC) is difficult to identify in electronic health records (EHRs) due to the lack of specific diagnostic codes. We developed a natural language processing (NLP) model to identify cough in free-text provider notes in EHRs from multiple health care providers with the objective of using the model in a rules-based CC algorithm to identify individuals with CC from EHRs and to describe the demographic and clinical characteristics of individuals with CC.
METHODS: This was a retrospective observational study of enrollees in Optum’s Integrated Clinical + Claims Database. Participants were 18–85 years of age with medical and pharmacy health insurance coverage between January 2016 and March 2017. A labeled reference standard data set was constructed by manually annotating 1000 randomly selected provider notes from the EHRs of enrollees with ≥ 1 cough mention. An NLP model was developed to extract positive or negated cough contexts. NLP, cough diagnosis and medications identified cough encounters. Patients with ≥ 3 encounters spanning at least 56 days within 120 days were defined as having CC.
RESULTS: The positive predictive value and sensitivity of the NLP algorithm were 0.96 and 0.68, respectively, for positive cough contexts, and 0.96 and 0.84, respectively, for negated cough contexts. Among the 4818 individuals identified as having CC, 37% were identified using NLP-identified cough mentions in provider notes alone, 16% by diagnosis codes and/or written medication orders, and 47% through a combination of provider notes and diagnosis codes/medications. Chronic cough patients were, on average, 61.0 years and 67.0% were female. The most prevalent comorbidities were respiratory infections (75%) and other lower respiratory disease (82%).
CONCLUSIONS: Our EHR-based algorithm integrating NLP methodology with structured fields was able to identify a CC population. Machine learning based approaches can therefore aid in patient selection for future CC research studies.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12890-022-02035-6.
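The encounter rule stated in METHODS (≥ 3 cough encounters spanning at least 56 days within 120 days) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `has_chronic_cough`, its parameters, the choice to count encounters by distinct calendar date, and the treatment of both day limits as inclusive are all assumptions made for illustration; the abstract does not specify these details.

```python
from datetime import date
from typing import Iterable


def has_chronic_cough(
    encounter_dates: Iterable[date],
    min_encounters: int = 3,   # ">= 3 encounters" (from the abstract)
    min_span_days: int = 56,   # "spanning at least 56 days" (from the abstract)
    window_days: int = 120,    # "within 120 days" (from the abstract)
) -> bool:
    """Hypothetical check of the rules-based CC definition.

    Assumptions (not stated in the source): encounters on the same
    calendar date count once, and both day limits are inclusive.
    """
    dates = sorted(set(encounter_dates))
    j = 0
    for i, start in enumerate(dates):
        # Advance j to the last encounter that falls within the window
        # beginning at the i-th encounter.
        j = max(j, i)
        while j + 1 < len(dates) and (dates[j + 1] - start).days <= window_days:
            j += 1
        enough_encounters = (j - i + 1) >= min_encounters
        long_enough_span = (dates[j] - start).days >= min_span_days
        if enough_encounters and long_enough_span:
            return True
    return False
```

Under these assumptions, three encounters on 1 January, 1 February, and 1 March 2016 would qualify, while three encounters within the same three weeks would not, because the span falls short of 56 days.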
format Online
Article
Text
id pubmed-9238070
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9238070 2022-06-29 BMC Pulm Med Research Article BioMed Central 2022-06-28 /pmc/articles/PMC9238070/ /pubmed/35764999 http://dx.doi.org/10.1186/s12890-022-02035-6 Text en © The Author(s) 2022
Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Development of a natural language processing algorithm to detect chronic cough in electronic health records
topic Research Article