
A hybrid method based on semi-supervised learning for relation extraction in Chinese EMRs

BACKGROUND: Building large-scale medical knowledge graphs requires automatically extracting the relations between entities from electronic medical records (EMRs). The main challenges are the scarcity of labeled corpora and the identification of complex semantic relations in the text of Chine...


Bibliographic Details
Main authors: Yang, Chunming, Xiao, Dan, Luo, Yuanyuan, Li, Bo, Zhao, Xujian, Zhang, Hui
Format: Online Article Text
Language: English
Published: BioMed Central 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235238/
https://www.ncbi.nlm.nih.gov/pubmed/35761319
http://dx.doi.org/10.1186/s12911-022-01908-4
author Yang, Chunming
Xiao, Dan
Luo, Yuanyuan
Li, Bo
Zhao, Xujian
Zhang, Hui
collection PubMed
description BACKGROUND: Building large-scale medical knowledge graphs requires automatically extracting the relations between entities from electronic medical records (EMRs). The main challenges are the scarcity of labeled corpora and the identification of complex semantic relations in the text of Chinese EMRs. A hybrid method based on semi-supervised learning is proposed to extract medical entity relations from small-scale, complex Chinese EMRs. METHODS: The semantic features of sentences are extracted by a residual network, and long-range dependency information is captured by a bidirectional gated recurrent unit. An attention mechanism is then applied to each set of extracted features to assign weights, and the outputs of the two attention mechanisms are integrated for relation prediction. We adjusted the training process using a small, manually annotated relation corpus and a bootstrapping semi-supervised learning algorithm, and continuously expanded the dataset during training. RESULTS: We constructed a small corpus for relation extraction from Chinese EMRs based on the EMR datasets released at the China Conference on Knowledge Graph and Semantic Computing. The experimental results show that the best F1-score of the proposed method over all relation categories reaches 89.78%, which is 13.07% higher than the baseline CNN.
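The bootstrapping step mentioned in METHODS can be illustrated with a minimal self-training sketch. This is not the authors' implementation: the abstract does not say how pseudo-labeled examples are selected, so the confidence threshold, the round limit, and all function names below are hypothetical, and a toy nearest-centroid classifier stands in for the residual network, bidirectional GRU, and attention model.

```python
import math


def train_centroids(examples):
    """Fit a toy nearest-centroid classifier: label -> mean feature vector.

    Stand-in for the real relation classifier (an assumption for illustration).
    """
    sums, counts = {}, {}
    for features, label in examples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_with_confidence(centroids, features):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    scores = {
        label: math.exp(-math.dist(features, centroid))
        for label, centroid in centroids.items()
    }
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    return best, scores[best] / total


def bootstrap(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Self-training loop: retrain, pseudo-label confident examples, grow the set.

    `threshold` and `max_rounds` are hypothetical knobs, not values from the paper.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = train_centroids(labeled)
        confident, remaining = [], []
        for features in pool:
            label, confidence = predict_with_confidence(model, features)
            if confidence >= threshold:
                confident.append((features, label))
            else:
                remaining.append(features)
        if not confident:
            break  # nothing new to add, so further rounds would not change the model
        labeled.extend(confident)
        pool = remaining
    return train_centroids(labeled), labeled, pool
```

Calling `bootstrap(seed_examples, unlabeled_features)` returns the final model, the expanded labeled set, and the examples that never cleared the threshold. Keeping low-confidence examples out of the training set is one common way to limit the label noise that bootstrapping can introduce.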
format Online
Article
Text
id pubmed-9235238
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-9235238 2022-06-28 A hybrid method based on semi-supervised learning for relation extraction in Chinese EMRs Yang, Chunming Xiao, Dan Luo, Yuanyuan Li, Bo Zhao, Xujian Zhang, Hui BMC Med Inform Decis Mak Research
BioMed Central 2022-06-27 /pmc/articles/PMC9235238/ /pubmed/35761319 http://dx.doi.org/10.1186/s12911-022-01908-4 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title A hybrid method based on semi-supervised learning for relation extraction in Chinese EMRs
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235238/
https://www.ncbi.nlm.nih.gov/pubmed/35761319
http://dx.doi.org/10.1186/s12911-022-01908-4