
Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature: Modeling Study

Bibliographic Details
Main Authors: Li, Yongbin, Hui, Linhu, Zou, Liping, Li, Huyang, Xu, Luo, Wang, Xiaohua, Chua, Stephanie
Format: Online Article Text
Language: English
Published: JMIR Publications 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9634522/
https://www.ncbi.nlm.nih.gov/pubmed/36264604
http://dx.doi.org/10.2196/41136
author Li, Yongbin
Hui, Linhu
Zou, Liping
Li, Huyang
Xu, Luo
Wang, Xiaohua
Chua, Stephanie
collection PubMed
description BACKGROUND: With the rapid expansion of biomedical literature, biomedical information extraction has attracted increasing attention from researchers. In particular, relation extraction between 2 entities is a long-term research topic.
OBJECTIVE: This study aimed to perform 2 multiclass relation extraction tasks of Biomedical Natural Language Processing Workshop 2019 Open Shared Tasks: relation extraction of Bacteria-Biotope (BB-rel) task and binary relation extraction of plant seed development (SeeDev-binary) task. In essence, these 2 tasks are aimed at extracting the relation between annotated entity pairs from biomedical texts, which is a challenging problem.
METHODS: Traditional research methods adopted feature- or kernel-based methods and achieved good performance. For these tasks, we propose a deep learning model based on a combination of several distributed features, such as domain-specific word embedding, part-of-speech embedding, entity-type embedding, distance embedding, and position embedding. The multi-head attention mechanism is used to extract the global semantic features of an entire sentence. Meanwhile, we introduced a dependency-type feature and the shortest dependency path connecting 2 candidate entities in the syntactic dependency graph to enrich the feature representation.
RESULTS: Experiments show that our proposed model has excellent performance in biomedical relation extraction, achieving F1 scores of 65.56% and 38.04% on the test sets of the BB-rel and SeeDev-binary tasks. Especially in the SeeDev-binary task, the F1 score of our model is superior to that of other existing models and achieves state-of-the-art performance.
CONCLUSIONS: We demonstrated that the multi-head attention mechanism can learn relevant syntactic and semantic features in different representation subspaces and different positions to extract comprehensive feature representation. Moreover, syntactic dependency features can improve the performance of the model by learning dependency relation between the entities in biomedical texts.
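The shortest dependency path mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the dependency parse is already available as (head, dependent, label) triples, treats the graph as undirected, and uses a plain breadth-first search. The function name `shortest_dependency_path`, the example sentence, and its hand-written parse are all hypothetical, and tokens are identified by their surface string for brevity (a real system would use token indices).

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the list of tokens on a shortest path from source to target,
    or None if the two tokens are not connected."""
    # Build an undirected adjacency map: the path between two entities
    # usually climbs to a shared head and then descends again.
    adjacency = {}
    for head, dependent, _label in edges:
        adjacency.setdefault(head, []).append(dependent)
        adjacency.setdefault(dependent, []).append(head)

    previous = {source: None}  # back-pointers, also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the back-pointers to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in adjacency.get(node, []):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical, hand-written parse of "Vibrio was isolated from seawater".
edges = [
    ("isolated", "Vibrio", "nsubjpass"),
    ("isolated", "was", "auxpass"),
    ("isolated", "seawater", "nmod"),
    ("seawater", "from", "case"),
]
print(shortest_dependency_path(edges, "Vibrio", "seawater"))
```

In this toy parse the two candidate entities are linked through their shared head verb, which is the kind of compact context such a path is meant to capture.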
format Online
Article
Text
id pubmed-9634522
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher JMIR Publications
record_format MEDLINE/PubMed
spelling pubmed-9634522 2022-11-05 Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature: Modeling Study Li, Yongbin Hui, Linhu Zou, Liping Li, Huyang Xu, Luo Wang, Xiaohua Chua, Stephanie JMIR Med Inform Original Paper JMIR Publications 2022-10-20 /pmc/articles/PMC9634522/ /pubmed/36264604 http://dx.doi.org/10.2196/41136 Text en
©Yongbin Li, Linhu Hui, Liping Zou, Huyang Li, Luo Xu, Xiaohua Wang, Stephanie Chua. Originally published in JMIR Medical Informatics (https://medinform.jmir.org), 20.10.2022. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on https://medinform.jmir.org/, as well as this copyright and license information must be included.
title Relation Extraction in Biomedical Texts Based on Multi-Head Attention Model With Syntactic Dependency Feature: Modeling Study
topic Original Paper
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9634522/
https://www.ncbi.nlm.nih.gov/pubmed/36264604
http://dx.doi.org/10.2196/41136