
An error analysis for image-based multi-modal neural machine translation

In this article, we conduct an extensive quantitative error analysis of different multi-modal neural machine translation (MNMT) models which integrate visual features into different parts of both the encoder and the decoder. We investigate the scenario where models are trained on an in-domain traini...


Bibliographic Details
Main authors:	Calixto, Iacer; Liu, Qun
Format:	Online Article Text
Language:	English
Published:	Springer Netherlands 2019
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6579783/ https://www.ncbi.nlm.nih.gov/pubmed/31281206 http://dx.doi.org/10.1007/s10590-019-09226-9

_version_	1783427901339729920
author	Calixto, Iacer Liu, Qun
author_facet	Calixto, Iacer Liu, Qun
author_sort	Calixto, Iacer
collection	PubMed
description	In this article, we conduct an extensive quantitative error analysis of different multi-modal neural machine translation (MNMT) models which integrate visual features into different parts of both the encoder and the decoder. We investigate the scenario where models are trained on an in-domain training data set of parallel sentence pairs with images. We analyse two different types of MNMT models that use global and local image features: the former encode an image globally, i.e. there is one feature vector representing an entire image, whereas the latter encode spatial information, i.e. there are multiple feature vectors, each encoding different portions of the image. We conduct an error analysis of translations generated by different MNMT models as well as text-only baselines, where we study how multi-modal models compare when translating both visual and non-visual terms. In general, we find that the additional multi-modal signals consistently improve translations, even more so when using simpler MNMT models that use global visual features. We also find that not only are translations of terms with a strong visual connotation improved, but almost all kinds of errors decrease when using multi-modal models.
format	Online Article Text
id	pubmed-6579783
institution	National Center for Biotechnology Information
language	English
publishDate	2019
publisher	Springer Netherlands
record_format	MEDLINE/PubMed
spelling	pubmed-65797832019-07-03 An error analysis for image-based multi-modal neural machine translation Calixto, Iacer Liu, Qun Mach Transl Article In this article, we conduct an extensive quantitative error analysis of different multi-modal neural machine translation (MNMT) models which integrate visual features into different parts of both the encoder and the decoder. We investigate the scenario where models are trained on an in-domain training data set of parallel sentence pairs with images. We analyse two different types of MNMT models that use global and local image features: the former encode an image globally, i.e. there is one feature vector representing an entire image, whereas the latter encode spatial information, i.e. there are multiple feature vectors, each encoding different portions of the image. We conduct an error analysis of translations generated by different MNMT models as well as text-only baselines, where we study how multi-modal models compare when translating both visual and non-visual terms. In general, we find that the additional multi-modal signals consistently improve translations, even more so when using simpler MNMT models that use global visual features. We also find that not only are translations of terms with a strong visual connotation improved, but almost all kinds of errors decrease when using multi-modal models. Springer Netherlands 2019-04-08 2019 /pmc/articles/PMC6579783/ /pubmed/31281206 http://dx.doi.org/10.1007/s10590-019-09226-9 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle	Article Calixto, Iacer Liu, Qun An error analysis for image-based multi-modal neural machine translation
title	An error analysis for image-based multi-modal neural machine translation
title_full	An error analysis for image-based multi-modal neural machine translation
title_fullStr	An error analysis for image-based multi-modal neural machine translation
title_full_unstemmed	An error analysis for image-based multi-modal neural machine translation
title_short	An error analysis for image-based multi-modal neural machine translation
title_sort	error analysis for image-based multi-modal neural machine translation
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6579783/ https://www.ncbi.nlm.nih.gov/pubmed/31281206 http://dx.doi.org/10.1007/s10590-019-09226-9
work_keys_str_mv	AT calixtoiacer anerroranalysisforimagebasedmultimodalneuralmachinetranslation AT liuqun anerroranalysisforimagebasedmultimodalneuralmachinetranslation AT calixtoiacer erroranalysisforimagebasedmultimodalneuralmachinetranslation AT liuqun erroranalysisforimagebasedmultimodalneuralmachinetranslation
