
Disease named entity recognition from biomedical literature using a novel convolutional neural network

BACKGROUND: Automatic disease named entity recognition (DNER) is of utmost importance for the development of more sophisticated BioNLP tools. However, most conventional CRF-based DNER systems rely on well-designed features whose selection is labor-intensive and time-consuming. Though most deep learning...


Bibliographic Details
Main authors: Zhao, Zhehuan, Yang, Zhihao, Luo, Ling, Wang, Lei, Zhang, Yin, Lin, Hongfei, Wang, Jian
Format: Online Article Text
Language: English
Published: BioMed Central 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751782/
https://www.ncbi.nlm.nih.gov/pubmed/29297367
http://dx.doi.org/10.1186/s12920-017-0316-8
_version_ 1783290017240580096
author Zhao, Zhehuan
Yang, Zhihao
Luo, Ling
Wang, Lei
Zhang, Yin
Lin, Hongfei
Wang, Jian
author_sort Zhao, Zhehuan
collection PubMed
description BACKGROUND: Automatic disease named entity recognition (DNER) is of utmost importance for the development of more sophisticated BioNLP tools. However, most conventional CRF-based DNER systems rely on well-designed features whose selection is labor-intensive and time-consuming. Although most deep learning methods can solve NER problems with little feature engineering, they employ an additional CRF layer to capture the correlation information between neighboring labels, which makes them considerably more complicated. METHODS: In this paper, we propose a novel multiple label convolutional neural network (MCNN) based disease NER approach. In this approach, a multiple label strategy (MLS), first introduced by us, is employed instead of the CRF layer. First, the character-level embedding, word-level embedding and lexicon feature embedding are concatenated. Then several convolutional layers are stacked over the concatenated embedding. Finally, the MLS is applied to the output layer to capture the correlation information between neighboring labels. RESULTS: The experimental results show that MCNN achieves state-of-the-art performance on both the NCBI and CDR corpora. CONCLUSIONS: The proposed MCNN-based disease NER method achieves state-of-the-art performance with little feature engineering. The experimental results also show the effectiveness of the MLS in capturing the correlation information between neighboring labels.
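The first two METHODS steps (concatenating the three embeddings, then stacking convolutional layers over the result) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the record: the dimensions, the random weights, the mean-pooled character embedding, the ReLU activation, the filter width, and all function names. The MLS output layer is not sketched because the abstract does not specify how it works.

```python
import random

rng = random.Random(0)

def table(n_rows, dim):
    """Random weight table standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(n_rows)]

# Hypothetical sizes chosen only to keep the example small.
CHAR_DIM, WORD_DIM, LEX_DIM, N_FILTERS, WIDTH = 4, 8, 2, 6, 3

char_table = table(128, CHAR_DIM)  # one row per ASCII code
word_table = table(50, WORD_DIM)   # toy vocabulary of 50 word ids
lex_table = table(2, LEX_DIM)      # 0 = token not in lexicon, 1 = in lexicon

def char_embedding(token):
    """Character-level vector for a token: mean of its character rows (assumption)."""
    rows = [char_table[ord(ch) % 128] for ch in token]
    return [sum(col) / len(rows) for col in zip(*rows)]

def embed(tokens, word_ids, lex_flags):
    """Concatenate character-level, word-level and lexicon embeddings per token."""
    return [char_embedding(tok) + word_table[w] + lex_table[f]
            for tok, w, f in zip(tokens, word_ids, lex_flags)]

def conv_layer(seq, kernels, width=WIDTH):
    """1-D convolution along the token axis with zero padding, followed by ReLU."""
    d_in = len(seq[0])
    pad = width // 2
    zero = [0.0] * d_in
    padded = [zero] * pad + seq + [zero] * pad
    out = []
    for t in range(len(seq)):
        # Flatten the window of `width` neighboring token vectors.
        window = [x for vec in padded[t:t + width] for x in vec]
        out.append([max(0.0, sum(w * x for w, x in zip(k, window)))
                    for k in kernels])
    return out

def stack(seq, layers):
    """Apply several convolutional layers one after another."""
    for kernels in layers:
        seq = conv_layer(seq, kernels)
    return seq

tokens = ["breast", "cancer", "risk"]
x = embed(tokens, word_ids=[3, 7, 11], lex_flags=[1, 1, 0])
in_dim = CHAR_DIM + WORD_DIM + LEX_DIM
layers = [table(N_FILTERS, WIDTH * in_dim), table(N_FILTERS, WIDTH * N_FILTERS)]
h = stack(x, layers)
print(len(h), len(h[0]))  # one vector of N_FILTERS features per token: 3 6
```

Zero padding keeps one output vector per input token, so a per-token labeling layer could be placed on top of `h`.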
format Online
Article
Text
id pubmed-5751782
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-5751782 2018-01-05 Disease named entity recognition from biomedical literature using a novel convolutional neural network Zhao, Zhehuan Yang, Zhihao Luo, Ling Wang, Lei Zhang, Yin Lin, Hongfei Wang, Jian BMC Med Genomics Research BioMed Central 2017-12-28 /pmc/articles/PMC5751782/ /pubmed/29297367 http://dx.doi.org/10.1186/s12920-017-0316-8 Text en © The Author(s). 2017
Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title Disease named entity recognition from biomedical literature using a novel convolutional neural network
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5751782/
https://www.ncbi.nlm.nih.gov/pubmed/29297367
http://dx.doi.org/10.1186/s12920-017-0316-8