Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning
MOTIVATION: Few-shot learning that can effectively perform named entity recognition in low-resource scenarios has attracted growing attention, but it has not yet been widely studied in the biomedical field. In contrast to high-resource domains, biomedical named entity recognition (BioNER) often encount...
Main authors: | Chen, Peng; Wang, Jian; Lin, Hongfei; Zhao, Di; Yang, Zhihao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2023 |
Subjects: | Original Paper |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444965/ https://www.ncbi.nlm.nih.gov/pubmed/37549065 http://dx.doi.org/10.1093/bioinformatics/btad496 |
_version_ | 1785094070416179200 |
---|---|
author | Chen, Peng Wang, Jian Lin, Hongfei Zhao, Di Yang, Zhihao |
author_facet | Chen, Peng Wang, Jian Lin, Hongfei Zhao, Di Yang, Zhihao |
author_sort | Chen, Peng |
collection | PubMed |
description | MOTIVATION: Few-shot learning that can effectively perform named entity recognition in low-resource scenarios has attracted growing attention, but it has not yet been widely studied in the biomedical field. In contrast to high-resource domains, biomedical named entity recognition (BioNER) often encounters limited human-labeled data in real-world scenarios, leading to poor generalization performance when training on only a few labeled instances. Recent approaches either leverage cross-domain high-resource data or fine-tune a pre-trained masked language model on the limited labeled samples to generate new synthetic data; the former is prone to domain shift and the latter tends to yield low-quality synthetic data. Therefore, in this article, we study a more realistic scenario, i.e. few-shot learning for BioNER. RESULTS: Leveraging a domain knowledge graph, we propose knowledge-guided instance generation for few-shot BioNER, which generates diverse and novel entities based on similar semantic relations of neighbor nodes. In addition, by introducing a question prompt, we cast BioNER as a question-answering task and propose prompt contrastive learning to improve the robustness of the model by measuring the mutual information between query–answer pairs. Extensive experiments conducted on various few-shot settings show that the proposed framework achieves superior performance. In particular, in a low-resource scenario with only 20 samples, our approach substantially outperforms recent state-of-the-art models on four benchmark datasets, achieving an average improvement of up to 7.1% F1. AVAILABILITY AND IMPLEMENTATION: Our source code and data are available at https://github.com/cpmss521/KGPC. |
format | Online Article Text |
id | pubmed-10444965 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-10444965 2023-08-24 Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning Chen, Peng Wang, Jian Lin, Hongfei Zhao, Di Yang, Zhihao Bioinformatics Original Paper Oxford University Press 2023-08-07 /pmc/articles/PMC10444965/ /pubmed/37549065 http://dx.doi.org/10.1093/bioinformatics/btad496 Text en © The Author(s) 2023. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Paper Chen, Peng Wang, Jian Lin, Hongfei Zhao, Di Yang, Zhihao Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning |
title | Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning |
title_full | Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning |
title_fullStr | Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning |
title_full_unstemmed | Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning |
title_short | Few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning |
title_sort | few-shot biomedical named entity recognition via knowledge-guided instance generation and prompt contrastive learning |
topic | Original Paper |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10444965/ https://www.ncbi.nlm.nih.gov/pubmed/37549065 http://dx.doi.org/10.1093/bioinformatics/btad496 |
work_keys_str_mv | AT chenpeng fewshotbiomedicalnamedentityrecognitionviaknowledgeguidedinstancegenerationandpromptcontrastivelearning AT wangjian fewshotbiomedicalnamedentityrecognitionviaknowledgeguidedinstancegenerationandpromptcontrastivelearning AT linhongfei fewshotbiomedicalnamedentityrecognitionviaknowledgeguidedinstancegenerationandpromptcontrastivelearning AT zhaodi fewshotbiomedicalnamedentityrecognitionviaknowledgeguidedinstancegenerationandpromptcontrastivelearning AT yangzhihao fewshotbiomedicalnamedentityrecognitionviaknowledgeguidedinstancegenerationandpromptcontrastivelearning |
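
The description field above says that the authors cast BioNER as a question-answering task by introducing a question prompt. The record gives no further detail, so the following is only a minimal sketch of what such a recasting could look like, not the authors' implementation (their code is at the GitHub URL in the description). The question templates, the `QAExample` class, the `to_qa_examples` function, and the example sentence are all hypothetical and introduced here for illustration.

```python
from dataclasses import dataclass

# Hypothetical question prompts, one per entity type.
# The record does not state the prompts the authors actually use.
QUESTION_TEMPLATES = {
    "Disease": "Which diseases are mentioned in the text?",
    "Chemical": "Which chemicals are mentioned in the text?",
}


@dataclass
class QAExample:
    question: str   # the question prompt for one entity type
    context: str    # the sentence to search for answers
    answers: list   # list of (start_char, end_char, answer_text)


def to_qa_examples(text, entities):
    """Turn one annotated sentence into one QA example per entity type.

    `entities` is a list of (start_char, end_char, entity_type) spans.
    An entity type with no mention still yields an example with an
    empty answer list, so the model also sees unanswerable questions.
    """
    examples = []
    for entity_type, question in QUESTION_TEMPLATES.items():
        answers = [
            (start, end, text[start:end])
            for start, end, label in entities
            if label == entity_type
        ]
        examples.append(QAExample(question, text, answers))
    return examples


# Illustrative sentence and annotations (not taken from any benchmark).
text = "Aspirin can cause Reye syndrome in children."
entities = [(0, 7, "Chemical"), (18, 31, "Disease")]

for example in to_qa_examples(text, entities):
    print(example.question, "->", [answer for _, _, answer in example.answers])
```

Framing each entity type as its own question lets the wording of the prompt carry label semantics, which is one plausible reason such a formulation helps when only a few labeled samples are available.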