Identification of asthma control factor in clinical notes using a hybrid deep learning model

BACKGROUND: There are significant variabilities in guideline-concordant documentation in asthma care. However, assessing clinician’s documentation is not feasible using only structured data but requires labor-intensive chart review of electronic health records (EHRs). A certain guideline element in...

Bibliographic Details
Main Authors:	Agnikula Kshatriya, Bhavani Singh, Sagheb, Elham, Wi, Chung-Il, Yoon, Jungwon, Seol, Hee Yun, Juhn, Young, Sohn, Sunghwan
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2021
Subjects:	Research
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8579684/ https://www.ncbi.nlm.nih.gov/pubmed/34753481 http://dx.doi.org/10.1186/s12911-021-01633-4

author	Agnikula Kshatriya, Bhavani Singh Sagheb, Elham Wi, Chung-Il Yoon, Jungwon Seol, Hee Yun Juhn, Young Sohn, Sunghwan
collection	PubMed
description	BACKGROUND: There are significant variabilities in guideline-concordant documentation in asthma care. However, assessing clinician’s documentation is not feasible using only structured data but requires labor-intensive chart review of electronic health records (EHRs). Certain guideline elements among asthma control factors, such as review of inhaler techniques, require context understanding to be captured correctly from EHR free text. METHODS: The study data consist of two sets: (1) manually chart-reviewed data: 1039 clinical notes of 300 patients with asthma diagnosis, and (2) weakly labeled data (distant supervision): 27,363 clinical notes from 800 patients with asthma diagnosis. A context-aware language model, Bidirectional Encoder Representations from Transformers (BERT), was developed to identify inhaler techniques in EHR free text. Both original BERT and clinical BioBERT (cBERT) were applied with cost sensitivity to deal with imbalanced data. Distant supervision using weak labels generated by rules was also incorporated to augment the training set and alleviate the costly manual labeling process in the development of a deep learning algorithm. A hybrid approach using post-hoc rules was also explored to fix BERT model errors. The performance of BERT with/without distant supervision, hybrid, and rule-based models was compared in precision, recall, F-score, and accuracy. RESULTS: The BERT models on the original data performed similarly to a rule-based model in F1-score (0.837, 0.845, and 0.838 for rules, BERT, and cBERT, respectively). The BERT models with distant supervision produced higher performance (0.853 and 0.880 for BERT and cBERT, respectively) than the models without distant supervision and the rule-based model. The hybrid models performed best, with F1-scores of 0.877 and 0.904 for BERT and cBERT, respectively, improving on the distant supervision models. CONCLUSIONS: The proposed BERT models with distant supervision demonstrated their capability to identify inhaler techniques in EHR free text, and outperformed both the rule-based model and BERT models trained on the original data. With a distant supervision approach, we may alleviate costly manual chart review to generate the large training data required in most deep learning-based models. A hybrid model was able to fix BERT model errors and further improve the performance.
format	Online Article Text
id	pubmed-8579684
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-8579684 2021-11-10 Identification of asthma control factor in clinical notes using a hybrid deep learning model Agnikula Kshatriya, Bhavani Singh Sagheb, Elham Wi, Chung-Il Yoon, Jungwon Seol, Hee Yun Juhn, Young Sohn, Sunghwan BMC Med Inform Decis Mak Research BioMed Central 2021-11-09 /pmc/articles/PMC8579684/ /pubmed/34753481 http://dx.doi.org/10.1186/s12911-021-01633-4 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title	Identification of asthma control factor in clinical notes using a hybrid deep learning model
topic	Research
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8579684/ https://www.ncbi.nlm.nih.gov/pubmed/34753481 http://dx.doi.org/10.1186/s12911-021-01633-4