
Recognizing Semantic Relations: Attention-Based Transformers vs. Recurrent Models

Bibliographic Details
Main Authors: Roussinov, Dmitri; Sharoff, Serge; Puchnina, Nadezhda
Format: Online Article Text
Language: English
Published: 2020
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7148207/
http://dx.doi.org/10.1007/978-3-030-45439-5_37
Description
Summary: Automatically recognizing an existing semantic relation (such as “is a”, “part of”, “property of”, “opposite of”, etc.) between two arbitrary words (phrases, concepts, etc.) is an important task affecting many information retrieval and artificial intelligence applications, including query expansion, common-sense reasoning, question answering, and database federation. Currently, two classes of approaches exist to classify a relation between words (concepts) X and Y: (1) path-based and (2) distributional. While path-based approaches look at word paths connecting X and Y in text, distributional approaches look at statistical properties of X and Y separately, not necessarily in proximity to each other. Here, we suggest how both types can be improved and empirically compare them using several standard benchmarking datasets. For our distributional approach, we suggest using an attention-based transformer. While transformers are known to support knowledge transfer between different tasks and have recently set a number of benchmarking records in various applications, we are the first to successfully apply them to the task of recognizing semantic relations. To improve the path-based approach, we propose an original neural word-path model that combines useful properties of convolutional and recurrent networks, thus addressing several shortcomings of prior path-based models. Both of our models significantly outperform the state of the art within their respective types. Our transformer-based approach outperforms the current state of the art by 1–12 percentage points on 4 out of 6 standard benchmarking datasets. This amounts to a 15–40% error reduction and closes the gap between automated and human performance by up to 50%. It also needs much less training data than prior approaches. To ease reproducing our results, we make our source code and trained models publicly available.
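
The summary describes the distributional, transformer-based approach only at a high level. As a rough illustration (not the authors' exact architecture), the sketch below shows how a generic pre-trained transformer with a sequence-pair classification head could be applied to relation recognition between two words X and Y; the BERT checkpoint, relation label set, and helper function are assumptions for the example, and in practice the model would first be fine-tuned on labelled word pairs.

# Minimal sketch (assumed details, not the paper's exact setup): classify the
# semantic relation between words X and Y by encoding them as a sentence pair
# for a pre-trained transformer with a classification head over relation labels.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

RELATIONS = ["is-a", "part-of", "property-of", "opposite-of", "unrelated"]  # assumed label set

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=len(RELATIONS)
)  # would be fine-tuned on word pairs labelled with their relation

def classify_relation(x: str, y: str) -> str:
    # Encode the pair as "[CLS] x [SEP] y [SEP]" and score each candidate relation.
    inputs = tokenizer(x, y, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return RELATIONS[int(torch.argmax(logits, dim=-1))]

print(classify_relation("wheel", "car"))  # expected "part-of" after fine-tuning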