Case-Sensitive Neural Machine Translation

Although word case carries important lexical information in Latin languages, it is often ignored in machine translation. Translation performance is observed to drop significantly when case-sensitive evaluation metrics are introduced. In this paper, we introduce two types of case-sensitive neu...

Bibliographic Details
Main Authors: Shi, Xuewen, Huang, Heyan, Jian, Ping, Tang, Yi-Kun
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206154/
http://dx.doi.org/10.1007/978-3-030-47426-3_51
_version_ 1783530357610512384
author Shi, Xuewen
Huang, Heyan
Jian, Ping
Tang, Yi-Kun
author_facet Shi, Xuewen
Huang, Heyan
Jian, Ping
Tang, Yi-Kun
author_sort Shi, Xuewen
collection PubMed
description Although word case carries important lexical information in Latin languages, it is often ignored in machine translation. Translation performance is observed to drop significantly when case-sensitive evaluation metrics are introduced. In this paper, we introduce two types of case-sensitive neural machine translation (NMT) approaches to alleviate this problem: i) adding case tokens to the decoding sequence, and ii) adding case prediction to conventional NMT. The proposed approaches incorporate case information into the NMT decoder by jointly learning target word generation and word case prediction. We compare our approaches with several kinds of baselines, including NMT with naive case-restoration methods, and analyze the impact of various setups on our approaches. Experimental results on three typical translation tasks (Zh-En, En-Fr, En-De) show that the proposed methods improve case-sensitive BLEU scores by up to 2.5, 1.0 and 0.5 points, respectively. Further analyses also illustrate the inherent reasons why our approaches lead to different improvements on different translation tasks.
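The first approach in the abstract, adding case tokens to the decoding sequence, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the token names `<cap>` and `<upper>`, the helper names `encode_case` and `decode_case`, and the three-way case scheme (lowercase, capitalized, all upper case) are assumptions made here for illustration only. The target sentence is lowercased and a case token is placed before each word that needs one, so a decoder trained on such sequences would generate case tokens and words jointly.

```python
# Hypothetical case tokens; the names are assumptions, not taken from the paper.
CAP, UPPER = "<cap>", "<upper>"


def encode_case(words):
    """Lowercase each word and put a case token in front of it when needed."""
    out = []
    for w in words:
        if len(w) > 1 and w.isupper():
            # Whole word in upper case, e.g. "NMT".
            out += [UPPER, w.lower()]
        elif w[:1].isupper():
            # Leading capital, e.g. "The". Inner capitals ("McDonald") are lost.
            out += [CAP, w.lower()]
        else:
            # Lowercase (or lowercase-initial) words pass through unchanged.
            out.append(w)
    return out


def decode_case(tokens):
    """Invert encode_case: apply each case token to the word that follows it."""
    out, pending = [], None
    for t in tokens:
        if t in (CAP, UPPER):
            pending = t
        elif pending == UPPER:
            out.append(t.upper())
            pending = None
        elif pending == CAP:
            out.append(t[:1].upper() + t[1:])
            pending = None
        else:
            out.append(t)
    return out


words = "The NMT model translates I".split()
tokens = encode_case(words)
restored = decode_case(tokens)
```

In this sketch the round trip is exact for lowercase, capitalized and all-upper-case words; words with inner capitals are restored only approximately, which a real system would have to handle separately.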
format Online
Article
Text
id pubmed-7206154
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-72061542020-05-08 Case-Sensitive Neural Machine Translation Shi, Xuewen Huang, Heyan Jian, Ping Tang, Yi-Kun Advances in Knowledge Discovery and Data Mining Article Even as an important lexical information for Latin languages, word case is often ignored in machine translation. According to observations, the translation performance drops significantly when we introduce case-sensitive evaluation metrics. In this paper, we introduce two types of case-sensitive neural machine translation (NMT) approaches to alleviate the above problems: i) adding case tokens into the decoding sequence, and ii) adopting case prediction to the conventional NMT. Our proposed approaches incorporate case information to the NMT decoder by jointly learning target word generation and word case prediction. We compare our approaches with multiple kinds of baselines including NMT with naive case-restoration methods and analyze the impacts of various setups on our approaches. Experimental results on three typical translation tasks (Zh-En, En-Fr, En-De) show that our proposed methods lead to the improvements up to 2.5, 1.0 and 0.5 in case-sensitive BLEU scores respectively. Further analyses also illustrate the inherent reasons why our approaches lead to different improvements on different translation tasks. 2020-04-17 /pmc/articles/PMC7206154/ http://dx.doi.org/10.1007/978-3-030-47426-3_51 Text en © Springer Nature Switzerland AG 2020 This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
spellingShingle Article
Shi, Xuewen
Huang, Heyan
Jian, Ping
Tang, Yi-Kun
Case-Sensitive Neural Machine Translation
title Case-Sensitive Neural Machine Translation
title_full Case-Sensitive Neural Machine Translation
title_fullStr Case-Sensitive Neural Machine Translation
title_full_unstemmed Case-Sensitive Neural Machine Translation
title_short Case-Sensitive Neural Machine Translation
title_sort case-sensitive neural machine translation
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7206154/
http://dx.doi.org/10.1007/978-3-030-47426-3_51
work_keys_str_mv AT shixuewen casesensitiveneuralmachinetranslation
AT huangheyan casesensitiveneuralmachinetranslation
AT jianping casesensitiveneuralmachinetranslation
AT tangyikun casesensitiveneuralmachinetranslation