Identifying encephalopathy in patients admitted to an intensive care unit: Going beyond structured information using natural language processing

BACKGROUND: Encephalopathy is a severe co-morbid condition in critically ill patients that includes different clinical constellations of neurological symptoms. However, even for the most recognised form, delirium, this medical condition is rarely recorded in structured fields of electronic health rec...

Bibliographic Details
Main Authors:	Ariño, Helena, Bae, Soo Kyung, Chaturvedi, Jaya, Wang, Tao, Roberts, Angus
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2023
Subjects:	Digital Health
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9899891/ https://www.ncbi.nlm.nih.gov/pubmed/36755566 http://dx.doi.org/10.3389/fdgth.2023.1085602

_version_	1784882731214176256
author	Ariño, Helena Bae, Soo Kyung Chaturvedi, Jaya Wang, Tao Roberts, Angus
author_sort	Ariño, Helena
collection	PubMed
description	BACKGROUND: Encephalopathy is a severe co-morbid condition in critically ill patients that includes different clinical constellations of neurological symptoms. However, even for the most recognised form, delirium, this medical condition is rarely recorded in structured fields of electronic health records, precluding large and unbiased retrospective studies. We aimed to identify patients with encephalopathy using a machine learning-based approach over clinical notes in electronic health records. METHODS: We used a list of ICD-9 codes and clinical concepts related to encephalopathy to define a cohort of patients from the MIMIC-III dataset. Clinical notes were annotated with MedCAT and vectorized with a bag-of-words approach or word embeddings, using clinical concepts normalised to standard nomenclatures as features. Machine learning algorithms (support vector machines and random forest) trained with clinical notes from patients who had a diagnosis of encephalopathy (defined by ICD-9 codes) were used to classify patients with clinical concepts related to encephalopathy in their clinical notes but without any relevant ICD-9 code. A random selection of 50 patients was reviewed by a clinical expert for model validation. RESULTS: Among 46,520 different patients, 7.5% had encephalopathy-related ICD-9 codes in all their admissions (group 1, definite encephalopathy), 45% had clinical concepts related to encephalopathy only in their clinical notes (group 2, possible encephalopathy) and 38% had encephalopathy-related concepts in neither structured fields nor clinical notes (group 3, non-encephalopathy). Length of stay, mortality rate and number of co-morbid conditions were higher in groups 1 and 2 compared to group 3. The best model to classify patients from group 2 as patients with encephalopathy (SVM using embeddings) had an F1 of 85% and predicted 31% of patients from group 2 as having encephalopathy with a probability >90%. Validation on new cases found a precision ranging from 92% to 98% depending on the criteria considered. CONCLUSIONS: Natural language processing techniques can leverage relevant clinical information that might help to identify patients with under-recognised clinical disorders such as encephalopathy. In the MIMIC dataset, this approach identifies with high probability thousands of patients who did not have a formal diagnosis in the structured information of the EHR.
format	Online Article Text
id	pubmed-9899891
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9899891 2023-02-07 Identifying encephalopathy in patients admitted to an intensive care unit: Going beyond structured information using natural language processing Ariño, Helena Bae, Soo Kyung Chaturvedi, Jaya Wang, Tao Roberts, Angus Front Digit Health Digital Health Frontiers Media S.A. 2023-01-23 /pmc/articles/PMC9899891/ /pubmed/36755566 http://dx.doi.org/10.3389/fdgth.2023.1085602 Text en © 2023 Ariño, Bae, Chaturvedi, Wang and Roberts. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Identifying encephalopathy in patients admitted to an intensive care unit: Going beyond structured information using natural language processing
topic	Digital Health
