
How Do Your Biomedical Named Entity Recognition Models Generalize to Novel Entities?

The volume of biomedical literature on new biomedical concepts is rapidly increasing, which necessitates a reliable biomedical named entity recognition (BioNER) model for identifying new and unseen entity mentions. However, it is questionable whether existing models can effectively handle them. In t...


Bibliographic Details
Format:	Online Article Text
Language:	English
Published:	IEEE 2022
Subjects:	Biomedical Engineering
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9014470/ https://www.ncbi.nlm.nih.gov/pubmed/35582496 http://dx.doi.org/10.1109/ACCESS.2022.3157854

collection	PubMed
description	The volume of biomedical literature on new biomedical concepts is rapidly increasing, which necessitates a reliable biomedical named entity recognition (BioNER) model for identifying new and unseen entity mentions. However, it is questionable whether existing models can effectively handle them. In this work, we systematically analyze the three types of recognition abilities of BioNER models: memorization, synonym generalization, and concept generalization. We find that although current best models achieve state-of-the-art performance on benchmarks based on overall performance, they have limitations in identifying synonyms and new biomedical concepts, indicating they are overestimated in terms of their generalization abilities. We also investigate failure cases of models and identify several difficulties in recognizing unseen mentions in biomedical literature as follows: (1) models tend to exploit dataset biases, which hinders the models’ abilities to generalize, and (2) several biomedical names have novel morphological patterns with weak name regularity, and models fail to recognize them. We apply a statistics-based debiasing method to our problem as a simple remedy and show the improvement in generalization to unseen mentions. We hope that our analyses and findings will facilitate further research into the generalization capabilities of NER models in a domain where their reliability is of utmost importance.
format	Online Article Text
id	pubmed-9014470
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	IEEE
record_format	MEDLINE/PubMed
spelling	pubmed-9014470 2022-05-13 IEEE Access Biomedical Engineering IEEE 2022-03-08 /pmc/articles/PMC9014470/ /pubmed/35582496 http://dx.doi.org/10.1109/ACCESS.2022.3157854 Text en https://creativecommons.org/licenses/by/4.0/ This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
title	How Do Your Biomedical Named Entity Recognition Models Generalize to Novel Entities?
topic	Biomedical Engineering
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9014470/ https://www.ncbi.nlm.nih.gov/pubmed/35582496 http://dx.doi.org/10.1109/ACCESS.2022.3157854


Similar Items