Explaining pretrained language models' understanding of linguistic structures using construction grammar
Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasizing the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combi...
Main authors: | Weissweiler, Leonie; Hofmann, Valentin; Köksal, Abdullatif; Schütze, Hinrich |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2023 |
Subjects: | Artificial Intelligence |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10600487/ https://www.ncbi.nlm.nih.gov/pubmed/37899964 http://dx.doi.org/10.3389/frai.2023.1225791 |
_version_ | 1785125997470810112 |
---|---|
author | Weissweiler, Leonie; Hofmann, Valentin; Köksal, Abdullatif; Schütze, Hinrich |
author_sort | Weissweiler, Leonie |
collection | PubMed |
description | Construction Grammar (CxG) is a paradigm from cognitive linguistics emphasizing the connection between syntax and semantics. Rather than rules that operate on lexical items, it posits constructions as the central building blocks of language, i.e., linguistic units of different granularity that combine syntax and semantics. As a first step toward assessing the compatibility of CxG with the syntactic and semantic knowledge demonstrated by state-of-the-art pretrained language models (PLMs), we present an investigation of their capability to classify and understand one of the most commonly studied constructions, the English comparative correlative (CC). We conduct experiments examining the classification accuracy of a syntactic probe on the one hand and the models' behavior in a semantic application task on the other, with BERT, RoBERTa, and DeBERTa as the example PLMs. Our results show that all three investigated PLMs, as well as OPT, are able to recognize the structure of the CC but fail to use its meaning. While human-like performance of PLMs on many NLP tasks has been alleged, this indicates that PLMs still suffer from substantial shortcomings in central domains of linguistic knowledge. |
format | Online Article Text |
id | pubmed-10600487 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-10600487 2023-10-27. Explaining pretrained language models' understanding of linguistic structures using construction grammar. Weissweiler, Leonie; Hofmann, Valentin; Köksal, Abdullatif; Schütze, Hinrich. Front Artif Intell. Artificial Intelligence. Frontiers Media S.A. 2023-10-12. /pmc/articles/PMC10600487/ /pubmed/37899964 http://dx.doi.org/10.3389/frai.2023.1225791 Text en. Copyright © 2023 Weissweiler, Hofmann, Köksal and Schütze. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
title | Explaining pretrained language models' understanding of linguistic structures using construction grammar |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10600487/ https://www.ncbi.nlm.nih.gov/pubmed/37899964 http://dx.doi.org/10.3389/frai.2023.1225791 |