A deep learning approach to bilingual lexicon induction in the biomedical domain

BACKGROUND: Bilingual lexicon induction (BLI) is an important task in the biomedical domain as translation resources are usually available for general language usage, but are often lacking in domain-specific settings. In this article we consider BLI as a classification problem and train a neural net...

Bibliographic Details
Main authors:	Heyman, Geert, Vulić, Ivan, Moens, Marie-Francine
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2018
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6038323/ https://www.ncbi.nlm.nih.gov/pubmed/29986664 http://dx.doi.org/10.1186/s12859-018-2245-8

_version_	1783338477449904128
author	Heyman, Geert Vulić, Ivan Moens, Marie-Francine
author_facet	Heyman, Geert Vulić, Ivan Moens, Marie-Francine
author_sort	Heyman, Geert
collection	PubMed
description	BACKGROUND: Bilingual lexicon induction (BLI) is an important task in the biomedical domain as translation resources are usually available for general language usage, but are often lacking in domain-specific settings. In this article we consider BLI as a classification problem and train a neural network composed of a combination of recurrent long short-term memory and deep feed-forward networks in order to obtain word-level and character-level representations. RESULTS: The results show that the word-level and character-level representations each improve state-of-the-art results for BLI and biomedical translation mining. The best results are obtained by exploiting the synergy between these word-level and character-level representations in the classification model. We evaluate the models both quantitatively and qualitatively. CONCLUSIONS: Translation of domain-specific biomedical terminology benefits from the character-level representations compared to relying solely on word-level representations. It is beneficial to take a deep learning approach and learn character-level representations rather than relying on handcrafted representations that are typically used. Our combined model captures the semantics at the word level while also taking into account that specialized terminology often originates from a common root form (e.g., from Greek or Latin).
format	Online Article Text
id	pubmed-6038323
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-6038323 2018-07-12 A deep learning approach to bilingual lexicon induction in the biomedical domain Heyman, Geert Vulić, Ivan Moens, Marie-Francine BMC Bioinformatics Methodology Article
BioMed Central 2018-07-09 /pmc/articles/PMC6038323/ /pubmed/29986664 http://dx.doi.org/10.1186/s12859-018-2245-8 Text en © The Author(s) 2018 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle	Methodology Article Heyman, Geert Vulić, Ivan Moens, Marie-Francine A deep learning approach to bilingual lexicon induction in the biomedical domain
title	A deep learning approach to bilingual lexicon induction in the biomedical domain
title_full	A deep learning approach to bilingual lexicon induction in the biomedical domain
title_fullStr	A deep learning approach to bilingual lexicon induction in the biomedical domain
title_full_unstemmed	A deep learning approach to bilingual lexicon induction in the biomedical domain
title_short	A deep learning approach to bilingual lexicon induction in the biomedical domain
title_sort	deep learning approach to bilingual lexicon induction in the biomedical domain
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6038323/ https://www.ncbi.nlm.nih.gov/pubmed/29986664 http://dx.doi.org/10.1186/s12859-018-2245-8
work_keys_str_mv	AT heymangeert adeeplearningapproachtobilinguallexiconinductioninthebiomedicaldomain AT vulicivan adeeplearningapproachtobilinguallexiconinductioninthebiomedicaldomain AT moensmariefrancine adeeplearningapproachtobilinguallexiconinductioninthebiomedicaldomain AT heymangeert deeplearningapproachtobilinguallexiconinductioninthebiomedicaldomain AT vulicivan deeplearningapproachtobilinguallexiconinductioninthebiomedicaldomain AT moensmariefrancine deeplearningapproachtobilinguallexiconinductioninthebiomedicaldomain
