Cargando…

Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach

BACKGROUND: Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data. OBJECTIVE: This study aims to use natural language processing (...

Descripción completa

Detalles Bibliográficos
Autores principales: Raza, Shaina, Schwartz, Brian
Formato: Online Artículo Texto
Lenguaje:English
Publicado: BioMed Central 2023
Materias:
Acceso en línea:https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879259/
https://www.ncbi.nlm.nih.gov/pubmed/36703154
http://dx.doi.org/10.1186/s12911-023-02117-3
Descripción
Sumario:BACKGROUND: Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data. OBJECTIVE: This study aims to use natural language processing (NLP) to extract the key information (clinical factors, social determinants of health) from published cases in the literature. METHODS: The proposed framework integrates a data layer for preparing a data cohort from clinical case reports; an NLP layer to find the clinical and demographic-named entities and relations in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports. RESULTS: The named entity recognition implementation in the NLP layer achieves a performance gain of about 1–3% compared to benchmark methods. Furthermore, even without extensive data labeling, the relation extraction method outperforms benchmark methods in terms of accuracy (by 1–8% better). A thorough examination reveals the disease’s presence and symptoms prevalence in patients. CONCLUSIONS: A similar approach can be generalized to other infectious diseases. It is worthwhile to use prior knowledge acquired through transfer learning when researching other infectious diseases. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-023-02117-3.