A roadmap to neural automatic post-editing: an empirical approach
In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been develo...
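Main authors: | Shterionov, Dimitar; Carmo, Félix do; Moorkens, Joss; Hossari, Murhaf; Wagner, Joachim; Paquin, Eric; Schmidtke, Dag; Groves, Declan; Way, Andy |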
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Netherlands 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501121/ https://www.ncbi.nlm.nih.gov/pubmed/33012986 http://dx.doi.org/10.1007/s10590-020-09249-7 |
_version_ | 1783583990451535872 |
---|---|
author | Shterionov, Dimitar; Carmo, Félix do; Moorkens, Joss; Hossari, Murhaf; Wagner, Joachim; Paquin, Eric; Schmidtke, Dag; Groves, Declan; Way, Andy |
author_facet | Shterionov, Dimitar; Carmo, Félix do; Moorkens, Joss; Hossari, Murhaf; Wagner, Joachim; Paquin, Eric; Schmidtke, Dag; Groves, Declan; Way, Andy |
author_sort | Shterionov, Dimitar |
collection | PubMed |
description | In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been developed and deployed in such workflows. With the advances in deep learning, neural APE (NPE) systems have outranked more traditional, statistical, ones. However, the plethora of options, variables and settings, as well as the relation between NPE performance and train/test data makes it difficult to select the most suitable approach for a given use case. In this article, we systematically analyse these different parameters with respect to NPE performance. We build an NPE “roadmap” to trace the different decision points and train a set of systems selecting different options through the roadmap. We also propose a novel approach for APE with data augmentation. We then analyse the performance of 15 of these systems and identify the best ones. In fact, the best systems are the ones that follow the newly-proposed method. The work presented in this article follows from a collaborative project between Microsoft and the ADAPT centre. The data provided by Microsoft originates from phrase-based statistical MT (PBSMT) systems employed in production. All tested NPE systems significantly increase the translation quality, proving the effectiveness of neural post-editing in the context of a commercial translation workflow that leverages PBSMT. |
format | Online Article Text |
id | pubmed-7501121 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Springer Netherlands |
record_format | MEDLINE/PubMed |
spelling | pubmed-75011212020-10-01 A roadmap to neural automatic post-editing: an empirical approach Shterionov, Dimitar Carmo, Félix do Moorkens, Joss Hossari, Murhaf Wagner, Joachim Paquin, Eric Schmidtke, Dag Groves, Declan Way, Andy Mach Transl Article In a translation workflow, machine translation (MT) is almost always followed by a human post-editing step, where the raw MT output is corrected to meet required quality standards. To reduce the number of errors human translators need to correct, automatic post-editing (APE) methods have been developed and deployed in such workflows. With the advances in deep learning, neural APE (NPE) systems have outranked more traditional, statistical, ones. However, the plethora of options, variables and settings, as well as the relation between NPE performance and train/test data makes it difficult to select the most suitable approach for a given use case. In this article, we systematically analyse these different parameters with respect to NPE performance. We build an NPE “roadmap” to trace the different decision points and train a set of systems selecting different options through the roadmap. We also propose a novel approach for APE with data augmentation. We then analyse the performance of 15 of these systems and identify the best ones. In fact, the best systems are the ones that follow the newly-proposed method. The work presented in this article follows from a collaborative project between Microsoft and the ADAPT centre. The data provided by Microsoft originates from phrase-based statistical MT (PBSMT) systems employed in production. All tested NPE systems significantly increase the translation quality, proving the effectiveness of neural post-editing in the context of a commercial translation workflow that leverages PBSMT. 
Springer Netherlands 2020-09-03 2020 /pmc/articles/PMC7501121/ /pubmed/33012986 http://dx.doi.org/10.1007/s10590-020-09249-7 Text en © The Author(s) 2020 Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Shterionov, Dimitar Carmo, Félix do Moorkens, Joss Hossari, Murhaf Wagner, Joachim Paquin, Eric Schmidtke, Dag Groves, Declan Way, Andy A roadmap to neural automatic post-editing: an empirical approach |
title | A roadmap to neural automatic post-editing: an empirical approach |
title_full | A roadmap to neural automatic post-editing: an empirical approach |
title_fullStr | A roadmap to neural automatic post-editing: an empirical approach |
title_full_unstemmed | A roadmap to neural automatic post-editing: an empirical approach |
title_short | A roadmap to neural automatic post-editing: an empirical approach |
title_sort | roadmap to neural automatic post-editing: an empirical approach |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7501121/ https://www.ncbi.nlm.nih.gov/pubmed/33012986 http://dx.doi.org/10.1007/s10590-020-09249-7 |