
Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models

The daily exponential increase of biomedical information in scientific literature and patents is a major obstacle to advancing biomedical research. A fundamental step here is to find key information (named entities) in these publications by applying Biomedical Named Entity Recognition (...


Bibliographic Details
Main authors: Saad, Farag; Aras, Hidir; Hackl-Sommer, René
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298184/
http://dx.doi.org/10.1007/978-3-030-51310-8_3
author Saad, Farag
Aras, Hidir
Hackl-Sommer, René
collection PubMed
description The daily exponential increase of biomedical information in scientific literature and patents is a major obstacle to advancing biomedical research. A fundamental step here is to find key information (named entities) in these publications by applying Biomedical Named Entity Recognition (BNER). However, BNER is more complex than traditional NER: biomedical named entities often have irregular expressions, use complex entity structures, and lack well-defined entity boundaries. In this paper, we propose a deep neural network (NN) architecture for BNER, namely a bidirectional Long Short-Term Memory (Bi-LSTM) based model. We present the neural network architecture in detail, showing the different NN layers, their interconnections, and their transformations. Using existing gold standard datasets, we evaluated and compared several models for identifying biomedical named entities such as chemicals, diseases, drugs, species, and genes/proteins. Our deep NN based Bi-LSTM model using both word-level and character-level embeddings significantly outperforms CRF and a Bi-LSTM using only word-level embeddings.
format Online
Article
Text
id pubmed-7298184
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7298184 2020-06-17 Natural Language Processing and Information Systems Article 2020-05-26 /pmc/articles/PMC7298184/ http://dx.doi.org/10.1007/978-3-030-51310-8_3 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298184/
http://dx.doi.org/10.1007/978-3-030-51310-8_3