Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models
The daily exponential increase of biomedical information in scientific literature and patents is a main obstacle to foster advances in biomedical research. A fundamental step hereby is to find key information (named entities) inside these publications applying Biomedical Named Entities Recognition (...
Main Authors: | Saad, Farag; Aras, Hidir; Hackl-Sommer, René |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298184/ http://dx.doi.org/10.1007/978-3-030-51310-8_3 |
_version_ | 1783547164485484544 |
---|---|
author | Saad, Farag; Aras, Hidir; Hackl-Sommer, René |
author_facet | Saad, Farag; Aras, Hidir; Hackl-Sommer, René |
author_sort | Saad, Farag |
collection | PubMed |
description | The daily exponential increase of biomedical information in scientific literature and patents is a main obstacle to foster advances in biomedical research. A fundamental step hereby is to find key information (named entities) inside these publications applying Biomedical Named Entities Recognition (BNER). However, BNER is a complex task compared to traditional NER as biomedical named entities often have irregular expressions, employ complex entity structures, and don’t consider well-defined entity boundaries, etc. In this paper, we propose a deep neural network (NN) architecture, namely the bidirectional Long-Short Term Memory (Bi-LSTM) based model for BNER. We present a detailed neural network architecture showing the different NN layers, their interconnections and transformations. Based on existing gold standard datasets, we evaluated and compared several models for identifying biomedical named entities such as chemicals, diseases, drugs, species and genes/proteins. Our deep NN based Bi-LSTM model using word and character level embeddings outperforms CRF and Bi-LSTM using only word level embeddings significantly. |
format | Online Article Text |
id | pubmed-7298184 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-72981842020-06-17 Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models Saad, Farag Aras, Hidir Hackl-Sommer, René Natural Language Processing and Information Systems Article The daily exponential increase of biomedical information in scientific literature and patents is a main obstacle to foster advances in biomedical research. A fundamental step hereby is to find key information (named entities) inside these publications applying Biomedical Named Entities Recognition (BNER). However, BNER is a complex task compared to traditional NER as biomedical named entities often have irregular expressions, employ complex entity structures, and don’t consider well-defined entity boundaries, etc. In this paper, we propose a deep neural network (NN) architecture, namely the bidirectional Long-Short Term Memory (Bi-LSTM) based model for BNER. We present a detailed neural network architecture showing the different NN layers, their interconnections and transformations. Based on existing gold standard datasets, we evaluated and compared several models for identifying biomedical named entities such as chemicals, diseases, drugs, species and genes/proteins. Our deep NN based Bi-LSTM model using word and character level embeddings outperforms CRF and Bi-LSTM using only word level embeddings significantly. 2020-05-26 /pmc/articles/PMC7298184/ http://dx.doi.org/10.1007/978-3-030-51310-8_3 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Article Saad, Farag Aras, Hidir Hackl-Sommer, René Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models |
title | Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models |
title_full | Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models |
title_fullStr | Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models |
title_full_unstemmed | Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models |
title_short | Improving Named Entity Recognition for Biomedical and Patent Data Using Bi-LSTM Deep Neural Network Models |
title_sort | improving named entity recognition for biomedical and patent data using bi-lstm deep neural network models |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7298184/ http://dx.doi.org/10.1007/978-3-030-51310-8_3 |
work_keys_str_mv | AT saadfarag improvingnamedentityrecognitionforbiomedicalandpatentdatausingbilstmdeepneuralnetworkmodels AT arashidir improvingnamedentityrecognitionforbiomedicalandpatentdatausingbilstmdeepneuralnetworkmodels AT hacklsommerrene improvingnamedentityrecognitionforbiomedicalandpatentdatausingbilstmdeepneuralnetworkmodels |