Biomedical named entity recognition based on fusion multi-features embedding

BACKGROUND: With the exponential increase in the volume of biomedical literature, text mining tasks are becoming increasingly important in the medical domain. Named entity recognition is a primary task in text mining and a prerequisite for, and critical part of, building medical domain knowledge gr...

Bibliographic Details
Main Authors: Li, Meijing; Yang, Hao; Liu, Yuxin
Format: Online Article Text
Language: English
Published: IOS Press 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10258877/
https://www.ncbi.nlm.nih.gov/pubmed/37038786
http://dx.doi.org/10.3233/THC-236011
author Li, Meijing
Yang, Hao
Liu, Yuxin
collection PubMed
description BACKGROUND: With the exponential increase in the volume of biomedical literature, text mining tasks are becoming increasingly important in the medical domain. Named entity recognition is a primary task in text mining and a prerequisite for, and critical part of, building medical domain knowledge graphs, medical question answering systems, and medical text classification. OBJECTIVE: The goal of this study is to recognize biomedical entities effectively by fusing multi-feature embeddings. Multiple features provide more comprehensive information, so better predictions can be obtained. METHODS: First, three kinds of features are generated at the word representation layer: deep contextual word-level features, local char-level features, and part-of-speech features. The resulting word representation vectors are then fed into a BiLSTM as features to obtain dependency information. Finally, a CRF is used to learn the features of the state sequences and obtain the globally optimal tagging sequence. RESULTS: The experimental results showed that the model outperformed other state-of-the-art methods in overall performance on six of eight datasets covering four biomedical entity types. CONCLUSION: The proposed method has a positive effect on the prediction results. It comprehensively considers the relevant factors of named entity recognition, because fusing multi-feature embeddings enhances the semantic information.
format Online
Article
Text
id pubmed-10258877
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher IOS Press
record_format MEDLINE/PubMed
spelling pubmed-10258877 2023-06-13 Biomedical named entity recognition based on fusion multi-features embedding Li, Meijing Yang, Hao Liu, Yuxin Technol Health Care Research Article IOS Press 2023-04-28 /pmc/articles/PMC10258877/ /pubmed/37038786 http://dx.doi.org/10.3233/THC-236011 Text en © 2023 – The authors. Published by IOS Press.
This is an open access article distributed under the terms of the Creative Commons Attribution Non-Commercial (CC BY-NC 4.0) License (https://creativecommons.org/licenses/by-nc/4.0/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Biomedical named entity recognition based on fusion multi-features embedding
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10258877/
https://www.ncbi.nlm.nih.gov/pubmed/37038786
http://dx.doi.org/10.3233/THC-236011
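
The METHODS summary in the description field can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that the three per-token feature vectors are fused by simple concatenation (the abstract does not state the fusion rule), it omits the BiLSTM entirely, and it treats the per-token tag scores as given. The function names `fuse_features` and `viterbi_decode`, the tag set, and all scores are hypothetical and chosen only for illustration. The second function shows how a linear-chain CRF picks the globally optimal tag sequence with Viterbi decoding rather than choosing the best tag per token.

```python
from typing import Dict, List, Sequence


def fuse_features(word_vec: Sequence[float],
                  char_vec: Sequence[float],
                  pos_vec: Sequence[float]) -> List[float]:
    """Fuse three per-token feature vectors by concatenation (an assumption)."""
    return list(word_vec) + list(char_vec) + list(pos_vec)


def viterbi_decode(emissions: List[Dict[str, float]],
                   transitions: Dict[str, Dict[str, float]],
                   tags: Sequence[str]) -> List[str]:
    """Return the highest-scoring tag sequence for a linear-chain model.

    emissions[t][tag]      -- score of `tag` at position t (stand-in for
                              the scores a BiLSTM layer would produce)
    transitions[prev][cur] -- score of moving from tag `prev` to tag `cur`
    """
    if not emissions:
        return []
    # best[t][tag] = best score of any path that ends in `tag` at position t
    best = [{tag: emissions[0][tag] for tag in tags}]
    back = []  # back[t-1][tag] = best previous tag for `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy example with hypothetical scores: O -> I is heavily penalised.
TAGS = ["B", "I", "O"]
TRANSITIONS = {
    "B": {"B": -1.0, "I": 1.0, "O": 0.0},
    "I": {"B": -1.0, "I": 0.5, "O": 0.0},
    "O": {"B": 0.0, "I": -10.0, "O": 0.5},
}
EMISSIONS = [
    {"B": 1.0, "I": 0.0, "O": 1.2},
    {"B": 0.0, "I": 2.0, "O": 0.5},
]
print(viterbi_decode(EMISSIONS, TRANSITIONS, TAGS))  # prints ['B', 'I']
```

In the toy example, choosing the best tag per token would give O followed by I, which the transition scores penalise heavily. Decoding over the whole sequence returns B followed by I instead, which is the sense in which the CRF layer yields a globally optimal tagging sequence.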