Synthesizing theories of human language with Bayesian program induction

Automated, data-driven construction and evaluation of scientific models and theories is a long-standing challenge in artificial intelligence. We present a framework for algorithmically synthesizing models of a basic part of human language: morpho-phonology, the system that builds word forms from sou...

Bibliographic Details
Main authors:	Ellis, Kevin; Albright, Adam; Solar-Lezama, Armando; Tenenbaum, Joshua B.; O’Donnell, Timothy J.
Format:	Online Article Text
Language:	English
Published:	Nature Publishing Group UK 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9427767/ https://www.ncbi.nlm.nih.gov/pubmed/36042196 http://dx.doi.org/10.1038/s41467-022-32012-w

_version_	1784778968390434816
author	Ellis, Kevin Albright, Adam Solar-Lezama, Armando Tenenbaum, Joshua B. O’Donnell, Timothy J.
author_facet	Ellis, Kevin Albright, Adam Solar-Lezama, Armando Tenenbaum, Joshua B. O’Donnell, Timothy J.
author_sort	Ellis, Kevin
collection	PubMed
description	Automated, data-driven construction and evaluation of scientific models and theories is a long-standing challenge in artificial intelligence. We present a framework for algorithmically synthesizing models of a basic part of human language: morpho-phonology, the system that builds word forms from sounds. We integrate Bayesian inference with program synthesis and representations inspired by linguistic theory and cognitive models of learning and discovery. Across 70 datasets from 58 diverse languages, our system synthesizes human-interpretable models for core aspects of each language’s morpho-phonology, sometimes approaching models posited by human linguists. Joint inference across all 70 data sets automatically synthesizes a meta-model encoding interpretable cross-language typological tendencies. Finally, the same algorithm captures few-shot learning dynamics, acquiring new morphophonological rules from just one or a few examples. These results suggest routes to more powerful machine-enabled discovery of interpretable models in linguistics and other scientific domains.
format	Online Article Text
id	pubmed-9427767
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Nature Publishing Group UK
record_format	MEDLINE/PubMed
spelling	pubmed-94277672022-09-01 Synthesizing theories of human language with Bayesian program induction Ellis, Kevin Albright, Adam Solar-Lezama, Armando Tenenbaum, Joshua B. O’Donnell, Timothy J. Nat Commun Article Automated, data-driven construction and evaluation of scientific models and theories is a long-standing challenge in artificial intelligence. We present a framework for algorithmically synthesizing models of a basic part of human language: morpho-phonology, the system that builds word forms from sounds. We integrate Bayesian inference with program synthesis and representations inspired by linguistic theory and cognitive models of learning and discovery. Across 70 datasets from 58 diverse languages, our system synthesizes human-interpretable models for core aspects of each language’s morpho-phonology, sometimes approaching models posited by human linguists. Joint inference across all 70 data sets automatically synthesizes a meta-model encoding interpretable cross-language typological tendencies. Finally, the same algorithm captures few-shot learning dynamics, acquiring new morphophonological rules from just one or a few examples. These results suggest routes to more powerful machine-enabled discovery of interpretable models in linguistics and other scientific domains. Nature Publishing Group UK 2022-08-30 /pmc/articles/PMC9427767/ /pubmed/36042196 http://dx.doi.org/10.1038/s41467-022-32012-w Text en © The Author(s) 2022 https://creativecommons.org/licenses/by/4.0/Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. 
The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
spellingShingle	Article Ellis, Kevin Albright, Adam Solar-Lezama, Armando Tenenbaum, Joshua B. O’Donnell, Timothy J. Synthesizing theories of human language with Bayesian program induction
title	Synthesizing theories of human language with Bayesian program induction
title_full	Synthesizing theories of human language with Bayesian program induction
title_fullStr	Synthesizing theories of human language with Bayesian program induction
title_full_unstemmed	Synthesizing theories of human language with Bayesian program induction
title_short	Synthesizing theories of human language with Bayesian program induction
title_sort	synthesizing theories of human language with bayesian program induction
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9427767/ https://www.ncbi.nlm.nih.gov/pubmed/36042196 http://dx.doi.org/10.1038/s41467-022-32012-w
work_keys_str_mv	AT elliskevin synthesizingtheoriesofhumanlanguagewithbayesianprograminduction AT albrightadam synthesizingtheoriesofhumanlanguagewithbayesianprograminduction AT solarlezamaarmando synthesizingtheoriesofhumanlanguagewithbayesianprograminduction AT tenenbaumjoshuab synthesizingtheoriesofhumanlanguagewithbayesianprograminduction AT odonnelltimothyj synthesizingtheoriesofhumanlanguagewithbayesianprograminduction
