
Entity relation extraction in the medical domain: based on data augmentation

BACKGROUND: Entity relation extraction is an important task in the construction of professional knowledge graphs in the medical field. Research on entity relation extraction for academic books in the medical field has revealed that there is a great difference in the number of different entity relati...

Bibliographic Details
Main Authors:	Wang, Anli; Li, Linyi; Wu, Xuehong; Zhu, Jianping; Yu, Shanshan; Chen, Xi; Li, Jianhua; Zhu, Hongtao
Format:	Online Article Text
Language:	English
Published:	AME Publishing Company 2022
Subjects:	Original Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9622485/ https://www.ncbi.nlm.nih.gov/pubmed/36330405 http://dx.doi.org/10.21037/atm-22-3991

author	Wang, Anli; Li, Linyi; Wu, Xuehong; Zhu, Jianping; Yu, Shanshan; Chen, Xi; Li, Jianhua; Zhu, Hongtao
author_sort	Wang, Anli
collection	PubMed
description	BACKGROUND: Entity relation extraction is an important task in the construction of professional knowledge graphs in the medical field. Research on entity relation extraction for academic books in the medical field has revealed that there is a great difference in the number of different entity relations, which has led to the formation of a typical unbalanced data set that is difficult to recognize but has certain research value. METHODS: In this article, we propose a new entity relation extraction method based on data augmentation. According to the distribution of individual entity relation classes in the data set, the probability of whether a text is augmented during training was calculated. In text-oriented data augmentation, different augmentation methods perform differently in different language environments. Reinforcement learning determines which data augmentation method to use in the current language environment. This strategy was applied to the entity relation extraction of the medical professional book, Pharmacopoeia of the People’s Republic of China, and different data augmentation methods (i.e., no data augmentation, traditional data augmentation, and reinforcement learning-based data augmentation) were compared under the same neural network model. RESULTS: The deep-learning model using data augmentation was better than the model without data augmentation, as data augmentation significantly improved the evaluation indicators of the relation classes with low data volumes in the unbalanced data set and slightly improved the evaluation indicators of the relation classes with sufficient features and large data volumes. Additionally, the deep-learning model using reinforcement learning-based data augmentation was superior to the deep-learning model using traditional data augmentation. We found that after the application of reinforcement learning-based data augmentation, the evaluation indicators of the multiple relation classes were much better than those to which reinforcement learning-based data augmentation had not been applied. CONCLUSIONS: For unbalanced data sets, data augmentation can effectively improve the ability of the deep-learning model to obtain data features, and reinforcement learning-based data augmentation can further enhance this ability. Our experiments confirmed the superiority of reinforcement learning-based data augmentation.
format	Online Article Text
id	pubmed-9622485
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	AME Publishing Company
record_format	MEDLINE/PubMed
spelling	pubmed-9622485 2022-11-02 Ann Transl Med Original Article AME Publishing Company 2022-10 /pmc/articles/PMC9622485/ /pubmed/36330405 http://dx.doi.org/10.21037/atm-22-3991 Text en 2022 Annals of Translational Medicine. All rights reserved. https://creativecommons.org/licenses/by-nc-nd/4.0/ Open Access Statement: This is an Open Access article distributed in accordance with the Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International License (CC BY-NC-ND 4.0), which permits the non-commercial replication and distribution of the article with the strict proviso that no changes or edits are made and the original work is properly cited (including links to both the formal publication through the relevant DOI and the license). See: https://creativecommons.org/licenses/by-nc-nd/4.0/.
title	Entity relation extraction in the medical domain: based on data augmentation
topic	Original Article
