Structured Semantic Knowledge Can Emerge Automatically from Predicting Word Sequences in Child-Directed Speech

Previous research has suggested that distributional learning mechanisms may contribute to the acquisition of semantic knowledge. However, distributional learning mechanisms, statistical learning, and contemporary “deep learning” approaches have been criticized for being incapable of learning the kin...

Bibliographic Details
Main authors: Huebner, Philip A., Willits, Jon A.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2018
Subjects: Psychology
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5827184/
https://www.ncbi.nlm.nih.gov/pubmed/29520243
http://dx.doi.org/10.3389/fpsyg.2018.00133
_version_ 1783302441723232256
author Huebner, Philip A.
Willits, Jon A.
collection PubMed
description Previous research has suggested that distributional learning mechanisms may contribute to the acquisition of semantic knowledge. However, distributional learning mechanisms, statistical learning, and contemporary “deep learning” approaches have been criticized for being incapable of learning the kind of abstract and structured knowledge that many think is required for acquisition of semantic knowledge. In this paper, we show that recurrent neural networks, trained on noisy naturalistic speech to children, do in fact learn what appears to be abstract and structured knowledge. We trained two types of recurrent neural networks (Simple Recurrent Network, and Long Short-Term Memory) to predict word sequences in a 5-million-word corpus of speech directed to children ages 0–3 years old, and assessed what semantic knowledge they acquired. We found that learned internal representations are encoding various abstract grammatical and semantic features that are useful for predicting word sequences. Assessing the organization of semantic knowledge in terms of the similarity structure, we found evidence of emergent categorical and hierarchical structure in both models. We found that the Long Short-term Memory (LSTM) and SRN are both learning very similar kinds of representations, but the LSTM achieved higher levels of performance on a quantitative evaluation. We also trained a non-recurrent neural network, Skip-gram, on the same input to compare our results to the state-of-the-art in machine learning. We found that Skip-gram achieves relatively similar performance to the LSTM, but is representing words more in terms of thematic compared to taxonomic relations, and we provide reasons why this might be the case. Our findings show that a learning system that derives abstract, distributed representations for the purpose of predicting sequential dependencies in naturalistic language may provide insight into emergence of many properties of the developing semantic system.
format Online
Article
Text
id pubmed-5827184
institution National Center for Biotechnology Information
language English
publishDate 2018
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-5827184 2018-03-08 Structured Semantic Knowledge Can Emerge Automatically from Predicting Word Sequences in Child-Directed Speech Huebner, Philip A. Willits, Jon A. Front Psychol Psychology Frontiers Media S.A. 2018-02-22 /pmc/articles/PMC5827184/ /pubmed/29520243 http://dx.doi.org/10.3389/fpsyg.2018.00133 Text en Copyright © 2018 Huebner and Willits. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Psychology
Huebner, Philip A.
Willits, Jon A.
Structured Semantic Knowledge Can Emerge Automatically from Predicting Word Sequences in Child-Directed Speech
title Structured Semantic Knowledge Can Emerge Automatically from Predicting Word Sequences in Child-Directed Speech
topic Psychology