Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators
Main Authors: Oh, Byung-Doh; Clark, Christian; Schuler, William
Format: Online Article Text
Language: English
Published: Frontiers Media S.A., 2022
Subjects: Artificial Intelligence
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8929193/ https://www.ncbi.nlm.nih.gov/pubmed/35310956 http://dx.doi.org/10.3389/frai.2022.777963
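Surprisal, the quantity named in the title, is standardly defined in expectation-based theories of processing (Hale, 2001; Levy, 2008) as the negative log probability of a word given its preceding context:

```latex
% Surprisal of word w_t in its sentential context
S(w_t) = -\log_2 P\left(w_t \mid w_1, \ldots, w_{t-1}\right)
```

The estimators compared in the article differ only in how the conditional probability is computed: by an incremental left-corner parser over linguistic structure, or by a neural language model over word sequences.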
_version_ | 1784670805496430592 |
author | Oh, Byung-Doh; Clark, Christian; Schuler, William
author_facet | Oh, Byung-Doh; Clark, Christian; Schuler, William
author_sort | Oh, Byung-Doh |
collection | PubMed |
description | Expectation-based theories of sentence processing posit that processing difficulty is determined by predictability in context. While predictability quantified via surprisal has gained empirical support, this representation-agnostic measure leaves open the question of how best to approximate the human comprehender's latent probability model. This article first describes an incremental left-corner parser that incorporates information about common linguistic abstractions such as syntactic categories, predicate-argument structure, and morphological rules as a computational-level model of sentence processing. The article then evaluates a variety of structural parsers and deep neural language models as cognitive models of sentence processing by comparing the predictive power of their surprisal estimates on self-paced reading, eye-tracking, and fMRI data collected during real-time language processing. The results show that surprisal estimates from the proposed left-corner processing model deliver comparable, and often superior, fits to self-paced reading and eye-tracking data compared with those from neural language models trained on much more data. This suggests that the strong linguistic generalizations made by the proposed processing model may help predict humanlike processing costs that manifest in latency-based measures, even when the amount of training data is limited. Additionally, experiments using Transformer-based language models sharing the same primary architecture and training data show a surprising negative correlation between parameter count and fit to self-paced reading and eye-tracking data. These findings suggest that large-scale neural language models make weaker generalizations based on patterns of lexical items rather than stronger, more humanlike generalizations based on linguistic structure. |
format | Online Article Text |
id | pubmed-8929193 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8929193 2022-03-18 Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators Oh, Byung-Doh Clark, Christian Schuler, William Front Artif Intell Artificial Intelligence Expectation-based theories of sentence processing posit that processing difficulty is determined by predictability in context. While predictability quantified via surprisal has gained empirical support, this representation-agnostic measure leaves open the question of how best to approximate the human comprehender's latent probability model. This article first describes an incremental left-corner parser that incorporates information about common linguistic abstractions such as syntactic categories, predicate-argument structure, and morphological rules as a computational-level model of sentence processing. The article then evaluates a variety of structural parsers and deep neural language models as cognitive models of sentence processing by comparing the predictive power of their surprisal estimates on self-paced reading, eye-tracking, and fMRI data collected during real-time language processing. The results show that surprisal estimates from the proposed left-corner processing model deliver comparable, and often superior, fits to self-paced reading and eye-tracking data compared with those from neural language models trained on much more data. This suggests that the strong linguistic generalizations made by the proposed processing model may help predict humanlike processing costs that manifest in latency-based measures, even when the amount of training data is limited. Additionally, experiments using Transformer-based language models sharing the same primary architecture and training data show a surprising negative correlation between parameter count and fit to self-paced reading and eye-tracking data. These findings suggest that large-scale neural language models make weaker generalizations based on patterns of lexical items rather than stronger, more humanlike generalizations based on linguistic structure. Frontiers Media S.A. 2022-03-03 /pmc/articles/PMC8929193/ /pubmed/35310956 http://dx.doi.org/10.3389/frai.2022.777963 Text en Copyright © 2022 Oh, Clark and Schuler. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Artificial Intelligence Oh, Byung-Doh Clark, Christian Schuler, William Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators |
title | Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators |
title_full | Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators |
title_fullStr | Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators |
title_full_unstemmed | Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators |
title_short | Comparison of Structural Parsers and Neural Language Models as Surprisal Estimators |
title_sort | comparison of structural parsers and neural language models as surprisal estimators |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8929193/ https://www.ncbi.nlm.nih.gov/pubmed/35310956 http://dx.doi.org/10.3389/frai.2022.777963 |
work_keys_str_mv | AT ohbyungdoh comparisonofstructuralparsersandneurallanguagemodelsassurprisalestimators AT clarkchristian comparisonofstructuralparsersandneurallanguagemodelsassurprisalestimators AT schulerwilliam comparisonofstructuralparsersandneurallanguagemodelsassurprisalestimators |
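As a concrete illustration of the neural-LM side of the comparison described in the abstract, the following is a minimal sketch of per-word surprisal extraction from an autoregressive language model. The `gpt2` checkpoint and the Hugging Face `transformers` API are illustrative assumptions; the article's own parsers and language models are not reproduced here.

```python
# Minimal sketch: per-word surprisal from an autoregressive LM.
# "gpt2" and the Hugging Face transformers API are illustrative
# assumptions; the article's own parsers and LMs are not reproduced here.
import math

import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def surprisal(sentence: str) -> list[tuple[str, float]]:
    """Return (token, surprisal in bits) for each token after the first."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits              # (1, seq_len, vocab_size)
    log_probs = torch.log_softmax(logits, dim=-1)
    pairs = []
    for t in range(1, ids.size(1)):
        # The distribution at position t-1 predicts the token at position t;
        # rescaling its natural-log probability yields -log2 P(w_t | w_<t).
        lp = log_probs[0, t - 1, ids[0, t]].item()
        pairs.append((tokenizer.decode(ids[0, t]), -lp / math.log(2)))
    return pairs

# Example: a garden-path sentence whose disambiguating words
# should receive high surprisal under most language models.
for token, bits in surprisal("The old man the boats."):
    print(f"{token!r}\t{bits:.2f} bits")
```

In evaluations like the article's, such per-word estimates are aligned with self-paced reading or eye-tracking measures, and the fit of regression models with and without the surprisal predictor is compared.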