Entity and relation extraction from clinical case reports of COVID-19: a natural language processing approach
Main authors: | , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9879259/ https://www.ncbi.nlm.nih.gov/pubmed/36703154 http://dx.doi.org/10.1186/s12911-023-02117-3 |
Summary: | BACKGROUND: Extracting relevant information about infectious diseases is an essential task. However, a significant obstacle in supporting public health research is the lack of methods for effectively mining large amounts of health data. OBJECTIVE: This study aims to use natural language processing (NLP) to extract key information (clinical factors, social determinants of health) from published cases in the literature. METHODS: The proposed framework integrates a data layer for preparing a data cohort from clinical case reports; an NLP layer to find the clinical and demographic named entities and relations in the texts; and an evaluation layer for benchmarking performance and analysis. The focus of this study is to extract valuable information from COVID-19 case reports. RESULTS: The named entity recognition implementation in the NLP layer achieves a performance gain of about 1–3% compared to benchmark methods. Furthermore, even without extensive data labeling, the relation extraction method outperforms benchmark methods in accuracy by 1–8%. A thorough examination reveals the disease's presence and the prevalence of symptoms in patients. CONCLUSIONS: A similar approach can be generalized to other infectious diseases. It is worthwhile to use prior knowledge acquired through transfer learning when researching other infectious diseases. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12911-023-02117-3. |
---|