
Schrödinger's tree—On syntax and neural language models

In the last half-decade, the field of natural language processing (NLP) has undergone two major transitions: the switch to neural networks as the primary modeling paradigm and the homogenization of the training regime (pre-train, then fine-tune). Amidst this process, language models have emerged as NLP's workhorse, displaying increasingly fluent generation capabilities and proving to be an indispensable means of knowledge transfer downstream. Due to the otherwise opaque, black-box nature of such models, researchers have employed aspects of linguistic theory in order to characterize their behavior. Questions central to syntax—the study of the hierarchical structure of language—have factored heavily into such work, shedding invaluable insights about models' inherent biases and their ability to make human-like generalizations. In this paper, we attempt to take stock of this growing body of literature. In doing so, we observe a lack of clarity across numerous dimensions, which influences the hypotheses that researchers form, as well as the conclusions they draw from their findings. To remedy this, we urge researchers to make careful considerations when investigating coding properties, selecting representations, and evaluating via downstream tasks. Furthermore, we outline the implications of the different types of research questions exhibited in studies on syntax, as well as the inherent pitfalls of aggregate metrics. Ultimately, we hope that our discussion adds nuance to the prospect of studying language models and paves the way for a less monolithic perspective on syntax in this context.

Bibliographic Details
Main Authors: Kulmizev, Artur; Nivre, Joakim
Format: Online Article, Text
Language: English
Published: Frontiers Media S.A., 2022
Subjects: Artificial Intelligence
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9618648/
https://www.ncbi.nlm.nih.gov/pubmed/36325030
http://dx.doi.org/10.3389/frai.2022.796788
Collection: PubMed
Institution: National Center for Biotechnology Information
Record ID: pubmed-9618648
Record Format: MEDLINE/PubMed
Journal: Front Artif Intell (Artificial Intelligence section), Frontiers Media S.A.; published online 2022-10-17
Copyright © 2022 Kulmizev and Nivre. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY, https://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.