
Beyond the Transformer: A Novel Polynomial Inherent Attention (PIA) Model and Its Great Impact on Neural Machine Translation

Bibliographic Details
Main Authors: ELAffendi, Mohammed, Alrajhi, Khawlah
Format: Online Article Text
Language: English
Published: Hindawi 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9519290/
https://www.ncbi.nlm.nih.gov/pubmed/36188704
http://dx.doi.org/10.1155/2022/1912750
Description
Summary: This paper describes a novel polynomial inherent attention (PIA) model that outperforms state-of-the-art transformer models on neural machine translation (NMT) by a wide margin. PIA is based on the simple idea that natural language sentences can be transformed into a special type of binary attention context vectors that accurately capture the semantic context and the relative dependencies between words in a sentence. The transformation is performed using a simple power-of-two polynomial transformation that maintains strict, consistent positioning of words in the resulting vectors. It is shown how this transformation reduces the neural machine translation process to a simple neural polynomial regression model that provides effective solutions to the alignment and positioning problems that trouble transformer models. The test BLEU scores obtained on the WMT-2014 data sets are 75.07 for EN-FR and 66.35 for EN-DE, well above the scores achieved by state-of-the-art transformer models on the same data sets. The improvements are, respectively, 65.7% and 87.42%.
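
As a rough illustration of the power-of-two positional idea mentioned in the summary, the minimal Python sketch below encodes a toy sentence into a vector in which each word's contribution is weighted by a power of two determined by its position, so word order is preserved deterministically. The function name, the bag-of-words layout, and the toy vocabulary are assumptions made for illustration only; the actual PIA transformation and regression model are defined in the paper itself.

    import numpy as np

    def power_of_two_context_vector(token_ids, vocab_size):
        # Hypothetical sketch: the word at position pos contributes weight
        # 2**pos, so each position receives a distinct, strictly ordered
        # weight and word order is reflected in the resulting vector.
        vec = np.zeros(vocab_size)
        for pos, tok in enumerate(token_ids):
            vec[tok] += 2.0 ** pos
        return vec

    # Toy vocabulary {0: "the", 1: "cat", 2: "sat"}; sentence "the cat sat"
    print(power_of_two_context_vector([0, 1, 2], vocab_size=3))
    # -> [1. 2. 4.]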