A hybrid method based on semi-supervised learning for relation extraction in Chinese EMRs
BACKGROUND: Building large-scale medical knowledge graphs requires automatically extracting the relations between entities from electronic medical records (EMRs). The main challenges are the scarcity of labeled corpora and the identification of complex semantic relations in the text of Chine...
Main authors: | Yang, Chunming; Xiao, Dan; Luo, Yuanyuan; Li, Bo; Zhao, Xujian; Zhang, Hui |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2022 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235238/ https://www.ncbi.nlm.nih.gov/pubmed/35761319 http://dx.doi.org/10.1186/s12911-022-01908-4 |
_version_ | 1784736271155855360 |
---|---|
author | Yang, Chunming Xiao, Dan Luo, Yuanyuan Li, Bo Zhao, Xujian Zhang, Hui |
author_sort | Yang, Chunming |
collection | PubMed |
description | BACKGROUND: Building large-scale medical knowledge graphs requires automatically extracting the relations between entities from electronic medical records (EMRs). The main challenges are the scarcity of labeled corpora and the identification of complex semantic relations in the text of Chinese EMRs. A hybrid method based on semi-supervised learning is proposed to extract medical entity relations from small-scale, complex Chinese EMRs. METHODS: The semantic features of sentences are extracted by a residual network, and long-range dependency information is captured by a bidirectional gated recurrent unit. An attention mechanism then assigns weights to each set of extracted features, and the outputs of the two attention mechanisms are integrated for relation prediction. The training process is adjusted using a small, manually annotated relation corpus and a bootstrapping semi-supervised learning algorithm, which continuously expands the dataset during training. RESULTS: We constructed a small corpus for relation extraction from Chinese EMRs, based on the EMR datasets released at the China Conference on Knowledge Graph and Semantic Computing. The experimental results show that the best F1-score of the proposed method over all relation categories reaches 89.78%, which is 13.07% higher than the baseline CNN. |
format | Online Article Text |
id | pubmed-9235238 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9235238 2022-06-28 A hybrid method based on semi-supervised learning for relation extraction in Chinese EMRs Yang, Chunming Xiao, Dan Luo, Yuanyuan Li, Bo Zhao, Xujian Zhang, Hui BMC Med Inform Decis Mak Research
BioMed Central 2022-06-27 /pmc/articles/PMC9235238/ /pubmed/35761319 http://dx.doi.org/10.1186/s12911-022-01908-4 Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
title | A hybrid method based on semi-supervised learning for relation extraction in Chinese EMRs |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9235238/ https://www.ncbi.nlm.nih.gov/pubmed/35761319 http://dx.doi.org/10.1186/s12911-022-01908-4 |
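
The METHODS text in the description mentions a bootstrapping semi-supervised learning algorithm that expands the dataset during training. The record gives no implementation details, so the following is only a minimal sketch of generic bootstrapping (self-training), not the authors' method. A toy nearest-centroid classifier stands in for the residual network, bidirectional gated recurrent unit and attention model, and every name here (`train_centroids`, `predict`, `bootstrap`, `threshold`, `max_rounds`) is a hypothetical chosen for illustration.

```python
import math
from collections import defaultdict


def train_centroids(samples):
    """Fit a toy nearest-centroid classifier: one mean vector per label."""
    sums, counts = {}, defaultdict(int)
    for x, y in samples:
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(centroids, x):
    """Return (label, confidence); confidence is a softmax over negative distances."""
    scores = {y: math.exp(-math.dist(x, c)) for y, c in centroids.items()}
    total = sum(scores.values())
    label = max(scores, key=scores.get)
    return label, scores[label] / total


def bootstrap(labeled, unlabeled, threshold=0.8, max_rounds=5):
    """Self-training loop: move confident pseudo-labels into the training set."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        model = train_centroids(labeled)
        confident, remaining = [], []
        for x in pool:
            label, conf = predict(model, x)
            (confident if conf >= threshold else remaining).append((x, label))
        if not confident:
            break  # nothing new was learned this round, so stop early
        labeled.extend(confident)
        pool = [x for x, _ in remaining]
    return train_centroids(labeled), labeled, pool


# Assumed toy data: two seed examples and three unlabeled points.
seed = [((0.0, 0.0), "A"), ((10.0, 10.0), "B")]
unlabeled = [(0.5, 0.5), (9.5, 9.0), (5.0, 5.0)]
model, grown, leftover = bootstrap(seed, unlabeled)
```

With this toy data, the two points near the seeds are pseudo-labeled and added to the training set, while the ambiguous midpoint stays in the unlabeled pool because its confidence never reaches the threshold. In a real setting the threshold trades corpus growth against the risk of reinforcing wrong labels.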