
Exploring the effects of drug, disease, and protein dependencies on biomedical named entity recognition: A comparative analysis

Background: Biomedical named entity recognition is one of the important tasks of biomedical literature mining. With the development of natural language processing technology, many deep learning models are used to extract valuable information from the biomedical literature, which promotes the develop...


Bibliographic Details
Main Authors: Han, Peifu; Li, Xue; Wang, Xun; Wang, Shuang; Gao, Changnan; Chen, Wenqi
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9812568/
https://www.ncbi.nlm.nih.gov/pubmed/36618912
http://dx.doi.org/10.3389/fphar.2022.1020759
author Han, Peifu
Li, Xue
Wang, Xun
Wang, Shuang
Gao, Changnan
Chen, Wenqi
collection PubMed
description Background: Biomedical named entity recognition (BioNER) is an important task in biomedical literature mining. With the development of natural language processing technology, many deep learning models have been used to extract valuable information from the biomedical literature, which has promoted the development of effective BioNER models. However, for specialized domains with diverse and complex contexts and a richer set of semantically related entity types (e.g., drug molecules, targets, and pathways in the biomedical domain), whether the dependencies among these drugs, diseases, and targets are helpful still needs to be explored. Method: To provide dependency information beyond context, we propose MKGAT, a method based on the graph attention network and the pre-trained BERT model, to improve BioNER performance in the biomedical domain. To enhance BioNER with external dependency knowledge, we integrate BERT-processed text embeddings and entity dependencies to construct better entity embedding representations. Results: On three datasets, namely the NCBI-disease corpus, BC2GM, and BC5CDR-chem, the proposed method achieves precision of 90.71%, 88.19%, and 95.71%, recall of 92.52%, 88.05%, and 95.62%, and F1-scores of 91.61%, 88.12%, and 95.66%, respectively, with competitive accuracy and higher efficiency than the state-of-the-art method. Conclusion: Drug, disease, and protein dependencies can allow entities to be better represented in neural networks, thereby improving the performance of BioNER.
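The F1-scores quoted in the description can be cross-checked against the quoted precision and recall, assuming the standard definition of F1 as the harmonic mean of the two (the record itself does not state the formula). The sketch below is only an arithmetic check of the reported figures; the function and variable names are illustrative and do not come from the article.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) in percent, copied from the description field.
reported = {
    "NCBI-disease": (90.71, 92.52),
    "BC2GM": (88.19, 88.05),
    "BC5CDR-chem": (95.71, 95.62),
}

# F1 per dataset, rounded to two decimals like the reported values.
f1_by_dataset = {
    name: round(f1_score(p, r), 2) for name, (p, r) in reported.items()
}

for name, f1 in f1_by_dataset.items():
    print(f"{name}: F1 = {f1:.2f}%")
```

Under that assumption, the recomputed values agree with the F1-scores listed in the description for all three datasets.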
format Online
Article
Text
id pubmed-9812568
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-9812568 2023-01-05 Front Pharmacol Pharmacology Frontiers Media S.A.
2022-12-21 /pmc/articles/PMC9812568/ /pubmed/36618912 http://dx.doi.org/10.3389/fphar.2022.1020759 Text en Copyright © 2022 Han, Li, Wang, Wang, Gao and Chen. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Exploring the effects of drug, disease, and protein dependencies on biomedical named entity recognition: A comparative analysis
topic Pharmacology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9812568/
https://www.ncbi.nlm.nih.gov/pubmed/36618912
http://dx.doi.org/10.3389/fphar.2022.1020759