Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks
Main Authors: Zhang, Canlin; Biś, Daniel; Liu, Xiuwen; He, Zhe
Format: Online Article Text
Language: English
Published: BioMed Central, 2019
Subjects: Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6886160/ https://www.ncbi.nlm.nih.gov/pubmed/31787096 http://dx.doi.org/10.1186/s12859-019-3079-8
author | Zhang, Canlin; Biś, Daniel; Liu, Xiuwen; He, Zhe
collection | PubMed |
description | BACKGROUND: In recent years, deep learning methods have achieved state-of-the-art performance on many natural language processing tasks. In the biomedical domain, however, they have not outperformed supervised word sense disambiguation (WSD) methods based on support vector machines or random forests, possibly due to inherent similarities among medical word senses. RESULTS: In this paper, we propose two deep-learning-based models for supervised WSD: a model based on a bidirectional long short-term memory (BiLSTM) network, and an attention model based on the self-attention architecture. Our results show that the BiLSTM neural network model with a suitable upper-layer structure performs even better than the existing state-of-the-art models on the MSH WSD dataset, while our attention model is three to four times faster than the BiLSTM model with good accuracy. In addition, we trained “universal” models to disambiguate all ambiguous words together: in these models, we concatenate the embedding of the target ambiguous word to the max-pooled vector, where it acts as a “hint”. Our universal BiLSTM neural network model yielded about 90 percent accuracy. CONCLUSION: Deep contextual models based on sequential information processing can capture the relevant contextual information from pre-trained input word embeddings and provide state-of-the-art results for supervised biomedical WSD tasks.
format | Online Article Text |
id | pubmed-6886160 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-6886160 2019-12-11. BMC Bioinformatics, Research. BioMed Central, 2019-12-02. © The Author(s) 2019. Open Access: This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title | Biomedical word sense disambiguation with bidirectional long short-term memory and attention-based neural networks |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6886160/ https://www.ncbi.nlm.nih.gov/pubmed/31787096 http://dx.doi.org/10.1186/s12859-019-3079-8 |
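The abstract above describes the “universal” model as BiLSTM states max-pooled over the context, with the embedding of the target ambiguous word concatenated as a “hint” before sense classification. The following is a minimal sketch of that idea, assuming PyTorch; the class name, layer sizes, and sense count are illustrative and are not taken from the paper.

```python
# Hypothetical sketch of the "universal" BiLSTM WSD model described in the
# abstract: a bidirectional LSTM over pre-trained context embeddings,
# max-pooled over time, with the target word's embedding appended as a "hint".
# All dimensions are illustrative, not the authors' settings.
import torch
import torch.nn as nn

class UniversalBiLSTMWSD(nn.Module):
    def __init__(self, emb_dim=200, hidden=128, num_senses=2):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                              bidirectional=True)
        # Classifier input: max-pooled BiLSTM states (2 * hidden)
        # concatenated with the target word embedding (emb_dim).
        self.classifier = nn.Linear(2 * hidden + emb_dim, num_senses)

    def forward(self, context_emb, target_emb):
        # context_emb: (batch, seq_len, emb_dim), pre-trained word embeddings
        # target_emb:  (batch, emb_dim), embedding of the ambiguous word
        states, _ = self.bilstm(context_emb)           # (batch, seq_len, 2*hidden)
        pooled = states.max(dim=1).values              # max-pool over time
        hint = torch.cat([pooled, target_emb], dim=1)  # append the "hint"
        return self.classifier(hint)                   # sense logits

# Usage with random tensors standing in for real embeddings:
# logits = UniversalBiLSTMWSD()(torch.randn(4, 30, 200), torch.randn(4, 200))
```

Max-pooling yields a fixed-size context representation regardless of sentence length, and the concatenated target embedding is presumably what lets a single shared classifier disambiguate all ambiguous words together, as the abstract states.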