GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text
MOTIVATION: Best performing named entity recognition (NER) methods for biomedical literature are based on hand-crafted features or task-specific rules, which are costly to produce and difficult to generalize to other corpora. End-to-end neural networks achieve state-of-the-art performance without ha...
Main authors: | Zhu, Qile; Li, Xiaolin; Conesa, Ana; Pereira, Cécile |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press, 2018 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5925775/ https://www.ncbi.nlm.nih.gov/pubmed/29272325 http://dx.doi.org/10.1093/bioinformatics/btx815 |
_version_ | 1783318771780288512 |
---|---|
author | Zhu, Qile; Li, Xiaolin; Conesa, Ana; Pereira, Cécile |
author_facet | Zhu, Qile; Li, Xiaolin; Conesa, Ana; Pereira, Cécile |
author_sort | Zhu, Qile |
collection | PubMed |
description | MOTIVATION: The best-performing named entity recognition (NER) methods for biomedical literature are based on hand-crafted features or task-specific rules, which are costly to produce and difficult to generalize to other corpora. End-to-end neural networks achieve state-of-the-art performance in non-biomedical NER tasks without hand-crafted features or task-specific knowledge. In the biomedical domain, however, the same architecture does not yield performance competitive with conventional machine learning models. RESULTS: We propose a novel end-to-end deep learning approach for biomedical NER tasks that leverages local context through n-gram character and word embeddings via a convolutional neural network (CNN). We call this approach GRAM-CNN. To label a word automatically, the method uses the local information around that word. The GRAM-CNN method therefore requires no specific knowledge or feature engineering and can, in theory, be applied to a wide range of existing NER problems. The GRAM-CNN approach was evaluated on three well-known biomedical datasets containing different BioNER entities. It obtained an F1-score of 87.26% on the Biocreative II dataset, 87.26% on the NCBI dataset and 72.57% on the JNLPBA dataset. These results place GRAM-CNN at the forefront of biological NER methods. To the best of our knowledge, we are the first to apply CNN-based structures to BioNER problems. AVAILABILITY AND IMPLEMENTATION: The GRAM-CNN source code, datasets and pre-trained model are available online at: https://github.com/valdersoul/GRAM-CNN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-5925775 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-59257752018-05-04 GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text Zhu, Qile Li, Xiaolin Conesa, Ana Pereira, Cécile Bioinformatics Original Papers MOTIVATION: Best performing named entity recognition (NER) methods for biomedical literature are based on hand-crafted features or task-specific rules, which are costly to produce and difficult to generalize to other corpora. End-to-end neural networks achieve state-of-the-art performance without hand-crafted features and task-specific knowledge in non-biomedical NER tasks. However, in the biomedical domain, using the same architecture does not yield competitive performance compared with conventional machine learning models. RESULTS: We propose a novel end-to-end deep learning approach for biomedical NER tasks that leverages the local contexts based on n-gram character and word embeddings via Convolutional Neural Network (CNN). We call this approach GRAM-CNN. To automatically label a word, this method uses the local information around a word. Therefore, the GRAM-CNN method does not require any specific knowledge or feature engineering and can be theoretically applied to a wide range of existing NER problems. The GRAM-CNN approach was evaluated on three well-known biomedical datasets containing different BioNER entities. It obtained an F1-score of 87.26% on the Biocreative II dataset, 87.26% on the NCBI dataset and 72.57% on the JNLPBA dataset. Those results put GRAM-CNN in the lead of the biological NER methods. To the best of our knowledge, we are the first to apply CNN based structures to BioNER problems. AVAILABILITY AND IMPLEMENTATION: The GRAM-CNN source code, datasets and pre-trained model are available online at: https://github.com/valdersoul/GRAM-CNN. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. 
Oxford University Press 2018-05-01 2017-12-20 /pmc/articles/PMC5925775/ /pubmed/29272325 http://dx.doi.org/10.1093/bioinformatics/btx815 Text en © The Author(s) 2017. Published by Oxford University Press. http://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Zhu, Qile Li, Xiaolin Conesa, Ana Pereira, Cécile GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text |
title | GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text |
title_full | GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text |
title_fullStr | GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text |
title_full_unstemmed | GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text |
title_short | GRAM-CNN: a deep learning approach with local context for named entity recognition in biomedical text |
title_sort | gram-cnn: a deep learning approach with local context for named entity recognition in biomedical text |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5925775/ https://www.ncbi.nlm.nih.gov/pubmed/29272325 http://dx.doi.org/10.1093/bioinformatics/btx815 |
work_keys_str_mv | AT zhuqile gramcnnadeeplearningapproachwithlocalcontextfornamedentityrecognitioninbiomedicaltext AT lixiaolin gramcnnadeeplearningapproachwithlocalcontextfornamedentityrecognitioninbiomedicaltext AT conesaana gramcnnadeeplearningapproachwithlocalcontextfornamedentityrecognitioninbiomedicaltext AT pereiracecile gramcnnadeeplearningapproachwithlocalcontextfornamedentityrecognitioninbiomedicaltext |
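
The abstract in this record describes labelling each word from the local context around it, using n-gram features computed by a CNN over embeddings. The record gives no architectural detail, so the following is only a toy sketch of the general idea of n-gram convolutions over a token sequence, not the authors' implementation. The function names, window widths (1, 3, 5), filter count, embedding size and random weights are all assumptions made for illustration; the actual code is in the repository linked above.

```python
import random


def make_filters(rng, n_filters, width, dim):
    """Random (untrained) filters, each spanning `width` consecutive vectors."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(width * dim)]
            for _ in range(n_filters)]


def conv1d_ngram(embeddings, filters, width):
    """Apply each filter to a window of `width` vectors centred on every token.

    The sequence is zero-padded so the output has one feature vector per token.
    """
    dim = len(embeddings[0])
    pad_left = (width - 1) // 2
    pad_right = width - 1 - pad_left
    zero = [0.0] * dim
    padded = [zero] * pad_left + embeddings + [zero] * pad_right
    out = []
    for i in range(len(embeddings)):
        # Flatten the n-gram window into a single vector of width * dim values.
        window = [x for vec in padded[i:i + width] for x in vec]
        # Dot product with each filter, followed by a ReLU non-linearity.
        out.append([max(0.0, sum(w * x for w, x in zip(f, window)))
                    for f in filters])
    return out


def gram_features(embeddings, widths=(1, 3, 5), n_filters=4, seed=0):
    """Concatenate, per token, the features from several n-gram widths."""
    rng = random.Random(seed)
    dim = len(embeddings[0])
    per_width = [
        conv1d_ngram(embeddings, make_filters(rng, n_filters, w, dim), w)
        for w in widths
    ]
    return [sum((feats[i] for feats in per_width), [])
            for i in range(len(embeddings))]


# Toy input: five tokens with random 8-dimensional "embeddings" (hypothetical).
tokens = ["BRCA1", "mutations", "in", "breast", "cancer"]
emb_rng = random.Random(42)
embeddings = [[emb_rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in tokens]

features = gram_features(embeddings)
for token, feat in zip(tokens, features):
    print(token, len(feat))
```

Each token ends up with one feature vector that mixes information from windows of several sizes around it. In a trained tagger, vectors like these would feed a classification layer that assigns the entity label, and the filters would be learned rather than random.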