A Relationship: Word Alignment, Phrase Table, and Translation Quality

In recent years, researchers have conducted several studies that evaluate machine translation quality based on the relationship between word alignments and the phrase table. However, existing methods usually employ ad hoc heuristics without theoretical support. So far, there is no discussion from the asp...

Bibliographic Details
Main Authors: Tian, Liang; Wong, Derek F.; Chao, Lidia S.; Oliveira, Francisco
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4030493/
https://www.ncbi.nlm.nih.gov/pubmed/24883402
http://dx.doi.org/10.1155/2014/438106
_version_ 1782317397277933568
author Tian, Liang
Wong, Derek F.
Chao, Lidia S.
Oliveira, Francisco
author_facet Tian, Liang
Wong, Derek F.
Chao, Lidia S.
Oliveira, Francisco
author_sort Tian, Liang
collection PubMed
description In recent years, researchers have conducted several studies that evaluate machine translation quality based on the relationship between word alignments and the phrase table. However, existing methods usually employ ad hoc heuristics without theoretical support. So far, no work has provided a formula describing the relationship among word alignments, the phrase table, and machine translation performance. In this paper, on the one hand, we focus on formulating such a relationship to estimate the number of extracted phrase pairs given one or more word alignment points. On the other hand, we propose a corpus-motivated pruning technique to prune the large default phrase table. Experiments show that the deduced formula is feasible: it can be used to predict the size of the phrase table, and it is also a valuable reference for investigating the relationship between translation performance and phrase tables built from different word alignment links. The corpus-motivated pruning results show that nearly 98% of phrases can be pruned without any significant loss in translation quality.
format Online
Article
Text
id pubmed-4030493
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-4030493 2014-06-01 A Relationship: Word Alignment, Phrase Table, and Translation Quality Tian, Liang Wong, Derek F. Chao, Lidia S. Oliveira, Francisco ScientificWorldJournal Research Article Hindawi Publishing Corporation 2014 2014-04-16 /pmc/articles/PMC4030493/ /pubmed/24883402 http://dx.doi.org/10.1155/2014/438106 Text en Copyright © 2014 Liang Tian et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Tian, Liang
Wong, Derek F.
Chao, Lidia S.
Oliveira, Francisco
A Relationship: Word Alignment, Phrase Table, and Translation Quality
title A Relationship: Word Alignment, Phrase Table, and Translation Quality
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4030493/
https://www.ncbi.nlm.nih.gov/pubmed/24883402
http://dx.doi.org/10.1155/2014/438106