Extraction of entity relations from Chinese medical literature based on multi-scale CRNN

BACKGROUND: Entity relation extraction technology can be used to extract entities and relations from medical literature, and automatically establish professional mapping knowledge domains. The classical text classification model, convolutional neural networks for sentence classification (TEXTCNN), h...

Bibliographic Details
Main authors: Chen, Tingyin; Wu, Xuehong; Li, Linyi; Li, Jianhua; Feng, Song
Format: Online Article Text
Language: English
Published: AME Publishing Company 2022
Subjects: Original Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9347033/
https://www.ncbi.nlm.nih.gov/pubmed/35928762
http://dx.doi.org/10.21037/atm-22-1226
author Chen, Tingyin
Wu, Xuehong
Li, Linyi
Li, Jianhua
Feng, Song
collection PubMed
description BACKGROUND: Entity relation extraction technology can be used to extract entities and relations from medical literature, and automatically establish professional mapping knowledge domains. The classical text classification model, convolutional neural networks for sentence classification (TEXTCNN), has been shown to have good classification performance, but also has a long-distance dependency problem, which is a common problem of convolutional neural networks (CNNs). Recurrent neural networks (RNNs) address the long-distance dependency problem but cannot capture text features at a specific scale in the text. METHODS: To solve these problems, this study sought to establish a model with a multi-scale convolutional recurrent neural network for sentence classification (TEXTCRNN) to address the deficiencies in the 2 neural network structures. In entity relation extraction, the entity pair is generally composed of a subject and an object, but because the subject in the entity pair of medical literature is always omitted, it is difficult to use this coding method to obtain general entity position information. Thus, we proposed a new coding method to obtain entity position information to re-establish the relationship between subject and object and complete the entity relation extraction. RESULTS: By comparing the benchmark neural network model and 2 typical multi-scale TEXTCRNN models, the TEXTCRNN [bidirectional long short-term memory (BiLSTM)] and TEXTCRNN [double-layer stacked gated recurrent unit (GRU)], the results showed that the multi-scale CRNN model achieved the best F1 score, and the TEXTCRNN (double-layer stacked GRU) was more capable of entity relation classification when the same entity word did not belong to the same entity relation.
CONCLUSIONS: The experimental results of the entity relation extraction from Pharmacopoeia of the People’s Republic of China—Guidelines for Clinical Drug Use—Volume of Chemical Drugs and Biological Products showed that entity relation extraction could effectively proceed using the new labeling method. Additionally, compared to typical neural network models, including the TEXTCNN, GRU, and BiLSTM, the multi-scale convolutional recurrent neural network structure had advantages across several evaluation indicators.
format Online
Article
Text
id pubmed-9347033
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher AME Publishing Company
record_format MEDLINE/PubMed
spelling pubmed-9347033 2022-08-03 Extraction of entity relations from Chinese medical literature based on multi-scale CRNN Chen, Tingyin Wu, Xuehong Li, Linyi Li, Jianhua Feng, Song Ann Transl Med Original Article AME Publishing Company 2022-05 /pmc/articles/PMC9347033/ /pubmed/35928762 http://dx.doi.org/10.21037/atm-22-1226 Text en 2022 Annals of Translational Medicine. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
title Extraction of entity relations from Chinese medical literature based on multi-scale CRNN
topic Original Article