
A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition


Bibliographic Details
Main Authors: Guan, Zhengyi, Zhou, Xiaobing
Format: Online Article Text
Language: English
Published: BioMed Central 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9907889/
https://www.ncbi.nlm.nih.gov/pubmed/36755230
http://dx.doi.org/10.1186/s12859-023-05172-9
_version_ 1784884266403889152
author Guan, Zhengyi
Zhou, Xiaobing
author_facet Guan, Zhengyi
Zhou, Xiaobing
author_sort Guan, Zhengyi
collection PubMed
description BACKGROUND: The biomedical literature is growing rapidly, and it is increasingly important to extract meaningful information from this vast body of work. Biomedical named entity recognition (BioNER) is a key and fundamental task in biomedical text mining. It also serves as a primitive step for many downstream applications such as relation extraction and knowledge base completion. Accurate identification of entities in the biomedical literature therefore has clear research value. However, the task is challenging due to the limitations of sequence labeling and the lack of large-scale labeled training data and domain knowledge. RESULTS: In this paper, we use a novel word-pair classification method, design a simple attention mechanism, and propose a novel architecture to address the difficulties of BioNER more efficiently without leveraging any external knowledge. Specifically, we overcome the limitations of sequence labeling-based approaches by predicting the relationship between word pairs. Building on this, we enhance the pre-trained model BioBERT with the proposed prefix and attention map discrimination fusion guided attention, yielding E-BioBERT. The proposed attention differentiates the distributions of different heads across layers of BioBERT, enriching the diversity of self-attention. Our model outperforms state-of-the-art models on five available datasets: BC4CHEMD, BC2GM, BC5CDR-Disease, BC5CDR-Chem, and NCBI-Disease, achieving F1-scores of 92.55%, 85.45%, 87.53%, 94.16%, and 90.55%, respectively. CONCLUSION: Unlike many previous models, our method does not require additional training datasets, external knowledge, or a complex training process. The experimental results on five BioNER benchmark datasets demonstrate that our model is better at mining semantic information, alleviates the problem of label inconsistency, and has stronger entity recognition ability.
More importantly, we analyze and demonstrate the effectiveness of our proposed attention.
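To make the word-pair idea in the abstract concrete, here is a minimal illustrative sketch, not the authors' E-BioBERT implementation: instead of assigning a sequence label to each token, a word-pair classifier scores every (start, end) word pair, and entity spans are decoded from the pairs that score above a threshold. The `pair_scores` dictionary and the example sentence are hypothetical stand-ins for a real model's output.

```python
def decode_entity_spans(pair_scores, threshold=0.5):
    """Decode entity spans from word-pair probabilities.

    pair_scores maps a (start, end) word-index pair to the model's
    probability that the words start..end form one entity mention.
    Keep well-formed pairs (start <= end) above the threshold.
    """
    spans = [(i, j) for (i, j), p in pair_scores.items()
             if i <= j and p >= threshold]
    return sorted(spans)

# Toy probabilities for the 3-word sentence "aspirin induced asthma";
# a real word-pair head over contextual embeddings would produce these.
scores = {(0, 0): 0.97, (0, 2): 0.12, (1, 1): 0.05, (2, 2): 0.91}
print(decode_entity_spans(scores))  # [(0, 0), (2, 2)]
```

Because each candidate span is classified independently, this formulation can represent overlapping mentions that a single BIO tag per token cannot, which is one motivation for moving beyond sequence labeling.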
format Online
Article
Text
id pubmed-9907889
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-99078892023-02-09 A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition Guan, Zhengyi Zhou, Xiaobing BMC Bioinformatics Research BACKGROUND: The biomedical literature is growing rapidly, and it is increasingly important to extract meaningful information from this vast body of work. Biomedical named entity recognition (BioNER) is a key and fundamental task in biomedical text mining. It also serves as a primitive step for many downstream applications such as relation extraction and knowledge base completion. Accurate identification of entities in the biomedical literature therefore has clear research value. However, the task is challenging due to the limitations of sequence labeling and the lack of large-scale labeled training data and domain knowledge. RESULTS: In this paper, we use a novel word-pair classification method, design a simple attention mechanism, and propose a novel architecture to address the difficulties of BioNER more efficiently without leveraging any external knowledge. Specifically, we overcome the limitations of sequence labeling-based approaches by predicting the relationship between word pairs. Building on this, we enhance the pre-trained model BioBERT with the proposed prefix and attention map discrimination fusion guided attention, yielding E-BioBERT. The proposed attention differentiates the distributions of different heads across layers of BioBERT, enriching the diversity of self-attention. Our model outperforms state-of-the-art models on five available datasets: BC4CHEMD, BC2GM, BC5CDR-Disease, BC5CDR-Chem, and NCBI-Disease, achieving F1-scores of 92.55%, 85.45%, 87.53%, 94.16%, and 90.55%, respectively. CONCLUSION: Unlike many previous models, our method does not require additional training datasets, external knowledge, or a complex training process.
The experimental results on five BioNER benchmark datasets demonstrate that our model is better at mining semantic information, alleviates the problem of label inconsistency, and has stronger entity recognition ability. More importantly, we analyze and demonstrate the effectiveness of our proposed attention. BioMed Central 2023-02-08 /pmc/articles/PMC9907889/ /pubmed/36755230 http://dx.doi.org/10.1186/s12859-023-05172-9 Text en © The Author(s) 2023 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Guan, Zhengyi
Zhou, Xiaobing
A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition
title A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition
title_full A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition
title_fullStr A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition
title_full_unstemmed A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition
title_short A prefix and attention map discrimination fusion guided attention for biomedical named entity recognition
title_sort prefix and attention map discrimination fusion guided attention for biomedical named entity recognition
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9907889/
https://www.ncbi.nlm.nih.gov/pubmed/36755230
http://dx.doi.org/10.1186/s12859-023-05172-9
work_keys_str_mv AT guanzhengyi aprefixandattentionmapdiscriminationfusionguidedattentionforbiomedicalnamedentityrecognition
AT zhouxiaobing aprefixandattentionmapdiscriminationfusionguidedattentionforbiomedicalnamedentityrecognition
AT guanzhengyi prefixandattentionmapdiscriminationfusionguidedattentionforbiomedicalnamedentityrecognition
AT zhouxiaobing prefixandattentionmapdiscriminationfusionguidedattentionforbiomedicalnamedentityrecognition