A roadmap to neural automatic post-editing: an empirical approach

Bibliographic Details
Main Authors: Shterionov, Dimitar; Carmo, Félix do; Moorkens, Joss; Hossari, Murhaf; Wagner, Joachim; Paquin, Eric; Schmidtke, Dag; Groves, Declan; Way, Andy
Format: Online Article Text
Language: English
Published: Springer Netherlands 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501121/
https://www.ncbi.nlm.nih.gov/pubmed/33012986
http://dx.doi.org/10.1007/s10590-020-09249-7
author Shterionov, Dimitar
Carmo, Félix do
Moorkens, Joss
Hossari, Murhaf
Wagner, Joachim
Paquin, Eric
Schmidtke, Dag
Groves, Declan
Way, Andy
collection PubMed
description In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been developed and deployed in such workflows. With the advances in deep learning, neural APE (NPE) systems have outranked more traditional, statistical, ones. However, the plethora of options, variables and settings, as well as the relation between NPE performance and train/test data makes it difficult to select the most suitable approach for a given use case. In this article, we systematically analyse these different parameters with respect to NPE performance. We build an NPE “roadmap” to trace the different decision points and train a set of systems selecting different options through the roadmap. We also propose a novel approach for APE with data augmentation. We then analyse the performance of 15 of these systems and identify the best ones. In fact, the best systems are the ones that follow the newly-proposed method. The work presented in this article follows from a collaborative project between Microsoft and the ADAPT centre. The data provided by Microsoft originates from phrase-based statistical MT (PBSMT) systems employed in production. All tested NPE systems significantly increase the translation quality, proving the effectiveness of neural post-editing in the context of a commercial translation workflow that leverages PBSMT.
format Online
Article
Text
id pubmed-7501121
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Springer Netherlands
record_format MEDLINE/PubMed
spelling pubmed-7501121 2020-10-01 Mach Transl Article
Springer Netherlands 2020-09-03 2020 /pmc/articles/PMC7501121/ /pubmed/33012986 http://dx.doi.org/10.1007/s10590-020-09249-7 Text en © The Author(s) 2020 Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
title A roadmap to neural automatic post-editing: an empirical approach
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501121/
https://www.ncbi.nlm.nih.gov/pubmed/33012986
http://dx.doi.org/10.1007/s10590-020-09249-7