A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition
Main Authors: | Guan, Zhengyi; Zhou, Xiaobing |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2023 |
Subjects: | Research |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9907889/ https://www.ncbi.nlm.nih.gov/pubmed/36755230 http://dx.doi.org/10.1186/s12859-023-05172-9 |
_version_ | 1784884266403889152 |
---|---|
author | Guan, Zhengyi Zhou, Xiaobing |
author_facet | Guan, Zhengyi Zhou, Xiaobing |
author_sort | Guan, Zhengyi |
collection | PubMed |
description | BACKGROUND: The biomedical literature is growing rapidly, and it is increasingly important to extract meaningful information from this vast body of literature. Biomedical named entity recognition (BioNER) is one of the key and fundamental tasks in biomedical text mining. It also acts as a primitive step for many downstream applications such as relation extraction and knowledge base completion. Therefore, the accurate identification of entities in biomedical literature has clear research value. However, this task is challenging due to the insufficiency of sequence labeling and the lack of large-scale labeled training data and domain knowledge. RESULTS: In this paper, we use a novel word-pair classification method, design a simple attention mechanism, and propose a novel architecture to address the research difficulties of BioNER more efficiently without leveraging any external knowledge. Specifically, we overcome the limitations of sequence labeling-based approaches by predicting the relationship between word pairs. Based on this, we enhance the pre-trained model BioBERT with the proposed prefix and attention map discrimination fusion guided attention, yielding E-BioBERT. Our proposed attention differentiates the distribution of different heads in different layers of BioBERT, which enriches the diversity of self-attention. Our model outperforms state-of-the-art models on five available datasets: BC4CHEMD, BC2GM, BC5CDR-Disease, BC5CDR-Chem, and NCBI-Disease, achieving F1-scores of 92.55%, 85.45%, 87.53%, 94.16% and 90.55%, respectively. CONCLUSION: Compared with various previous models, our method does not require additional training datasets, external knowledge, or a complex training process. The experimental results on five BioNER benchmark datasets demonstrate that our model is better at mining semantic information and alleviating label inconsistency, and has stronger entity recognition ability. 
More importantly, we analyze and demonstrate the effectiveness of our proposed attention. |
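The word-pair classification idea described in the abstract can be illustrated with a minimal decoding sketch. This is a hypothetical simplification, not the paper's actual scheme: it assumes the model has already produced a relation matrix `rel` in which `rel[head][tail]` holds an entity-type label when the words from `head` to `tail` form an entity, and `None` otherwise. The tokens, labels, and matrix layout below are all illustrative assumptions.

```python
# Minimal sketch of decoding entities from a word-pair relation matrix
# (hypothetical label scheme; the paper's exact formulation may differ).

def decode_entities(tokens, rel):
    """Turn a word-pair relation matrix into (span_text, entity_type) pairs.

    rel[head][tail] is assumed to be an entity-type string when the words
    tokens[head..tail] form an entity, and None otherwise.
    """
    entities = []
    n = len(tokens)
    for head in range(n):
        for tail in range(head, n):
            label = rel[head][tail]
            if label is not None:
                entities.append((" ".join(tokens[head:tail + 1]), label))
    return entities

# Toy example: one chemical and one multi-word disease mention.
tokens = ["Aspirin", "reduces", "coronary", "heart", "disease", "risk"]
rel = [[None] * 6 for _ in range(6)]
rel[0][0] = "Chemical"   # "Aspirin"
rel[2][4] = "Disease"    # "coronary heart disease"

print(decode_entities(tokens, rel))
# [('Aspirin', 'Chemical'), ('coronary heart disease', 'Disease')]
```

Unlike BIO-style sequence labeling, a word-pair formulation like this can represent overlapping spans directly, since each (head, tail) pair is classified independently.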
format | Online Article Text |
id | pubmed-9907889 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-9907889 2023-02-09 A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition Guan, Zhengyi Zhou, Xiaobing BMC Bioinformatics Research BACKGROUND: The biomedical literature is growing rapidly, and it is increasingly important to extract meaningful information from this vast body of literature. Biomedical named entity recognition (BioNER) is one of the key and fundamental tasks in biomedical text mining. It also acts as a primitive step for many downstream applications such as relation extraction and knowledge base completion. Therefore, the accurate identification of entities in biomedical literature has clear research value. However, this task is challenging due to the insufficiency of sequence labeling and the lack of large-scale labeled training data and domain knowledge. RESULTS: In this paper, we use a novel word-pair classification method, design a simple attention mechanism, and propose a novel architecture to address the research difficulties of BioNER more efficiently without leveraging any external knowledge. Specifically, we overcome the limitations of sequence labeling-based approaches by predicting the relationship between word pairs. Based on this, we enhance the pre-trained model BioBERT with the proposed prefix and attention map discrimination fusion guided attention, yielding E-BioBERT. Our proposed attention differentiates the distribution of different heads in different layers of BioBERT, which enriches the diversity of self-attention. Our model outperforms state-of-the-art models on five available datasets: BC4CHEMD, BC2GM, BC5CDR-Disease, BC5CDR-Chem, and NCBI-Disease, achieving F1-scores of 92.55%, 85.45%, 87.53%, 94.16% and 90.55%, respectively. CONCLUSION: Compared with various previous models, our method does not require additional training datasets, external knowledge, or a complex training process. 
The experimental results on five BioNER benchmark datasets demonstrate that our model is better at mining semantic information and alleviating label inconsistency, and has stronger entity recognition ability. More importantly, we analyze and demonstrate the effectiveness of our proposed attention. BioMed Central 2023-02-08 /pmc/articles/PMC9907889/ /pubmed/36755230 http://dx.doi.org/10.1186/s12859-023-05172-9 Text en © The Author(s) 2023. Open Access: this article is licensed under a Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, provided appropriate credit is given. The Creative Commons Public Domain Dedication waiver (https://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data. |
spellingShingle | Research Guan, Zhengyi Zhou, Xiaobing A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition |
title | A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition |
title_full | A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition |
title_fullStr | A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition |
title_full_unstemmed | A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition |
title_short | A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition |
title_sort | prefix and attention map discrimination fusion guided attention for biomedical named entity recognition |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9907889/ https://www.ncbi.nlm.nih.gov/pubmed/36755230 http://dx.doi.org/10.1186/s12859-023-05172-9 |
work_keys_str_mv | AT guanzhengyi aprefixandattentionmapdiscriminationfusionguidedattentionforbiomedicalnamedentityrecognition AT zhouxiaobing aprefixandattentionmapdiscriminationfusionguidedattentionforbiomedicalnamedentityrecognition AT guanzhengyi prefixandattentionmapdiscriminationfusionguidedattentionforbiomedicalnamedentityrecognition AT zhouxiaobing prefixandattentionmapdiscriminationfusionguidedattentionforbiomedicalnamedentityrecognition |