An NLP tool for data extraction from electronic health records: COVID-19 mortalities and comorbidities

BACKGROUND: The high infection rate, severe symptoms, and evolving aspects of the COVID-19 pandemic provide challenges for a variety of medical systems around the world. Automatic information retrieval from unstructured text is greatly aided by Natural Language Processing (NLP), the primary approach...

Bibliographic Details
Main Authors: BuHamra, Sana S., Almutairi, Abdullah N., Buhamrah, Abdullah K., Almadani, Sabah H., Alibrahim, Yusuf A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects: Public Health
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9751356/
https://www.ncbi.nlm.nih.gov/pubmed/36530667
http://dx.doi.org/10.3389/fpubh.2022.1070870
author BuHamra, Sana S.
Almutairi, Abdullah N.
Buhamrah, Abdullah K.
Almadani, Sabah H.
Alibrahim, Yusuf A.
collection PubMed
description BACKGROUND: The high infection rate, severe symptoms, and evolving aspects of the COVID-19 pandemic pose challenges for a variety of medical systems around the world. Automatic information retrieval from unstructured text is greatly aided by Natural Language Processing (NLP), the primary approach taken in this field. This study addresses COVID-19 mortality data from the intensive care unit (ICU) in Kuwait during the first 18 months of the pandemic. A key goal is to extract and classify the primary and intermediate causes of death from electronic health records (EHRs) in a timely way. In addition, comorbid conditions or concurrent diseases were retrieved and analyzed in relation to a variety of causes of mortality. METHOD: An NLP system using the Python programming language is constructed to automate the process of extracting primary and secondary causes of death, as well as comorbidities. The system is capable of handling inaccurate and messy data, including inadequate formats, spelling mistakes, and mispositioned information. A decision-tree machine learning method is used to classify the causes of death. RESULTS: For 54.8% of the 1691 ICU patients we studied, septic shock or sepsis-related multiorgan failure was the leading cause of mortality. About three-quarters of patients died from acute respiratory distress syndrome (ARDS), a common intermediate cause of death. An arrhythmia (AF) disorder was determined to be the strongest predictor of intermediate cause of death, whether caused by ARDS or other causes. CONCLUSION: We created an NLP system to automate the extraction of causes of death and comorbidities from EHRs. Our method processes messy and erroneous data and classifies the primary and intermediate causes of death of COVID-19 patients. We advocate arranging the EHR with well-defined sections and menu-driven options to reduce incorrect forms.
format Online
Article
Text
id pubmed-9751356
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9751356 2022-12-16 An NLP tool for data extraction from electronic health records: COVID-19 mortalities and comorbidities BuHamra, Sana S. Almutairi, Abdullah N. Buhamrah, Abdullah K. Almadani, Sabah H. Alibrahim, Yusuf A. Front Public Health Public Health Frontiers Media S.A. 2022-12-01 /pmc/articles/PMC9751356/ /pubmed/36530667 http://dx.doi.org/10.3389/fpubh.2022.1070870 Text en Copyright © 2022 BuHamra, Almutairi, Buhamrah, Almadani and Alibrahim. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title An NLP tool for data extraction from electronic health records: COVID-19 mortalities and comorbidities
topic Public Health
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9751356/
https://www.ncbi.nlm.nih.gov/pubmed/36530667
http://dx.doi.org/10.3389/fpubh.2022.1070870