Do neural nets learn statistical laws behind natural language?

The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language e...

Bibliographic Details
Main Authors: Takahashi, Shuntaro, Tanaka-Ishii, Kumiko
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5747447/
https://www.ncbi.nlm.nih.gov/pubmed/29287076
http://dx.doi.org/10.1371/journal.pone.0189326
author Takahashi, Shuntaro
Tanaka-Ishii, Kumiko
author_facet Takahashi, Shuntaro
Tanaka-Ishii, Kumiko
author_sort Takahashi, Shuntaro
collection PubMed
description The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf’s law and Heaps’ law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf’s law and Heaps’ law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
format Online
Article
Text
id pubmed-5747447
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5747447 2018-01-26 Do neural nets learn statistical laws behind natural language? Takahashi, Shuntaro Tanaka-Ishii, Kumiko PLoS One Research Article Public Library of Science 2017-12-29 /pmc/articles/PMC5747447/ /pubmed/29287076 http://dx.doi.org/10.1371/journal.pone.0189326 Text en © 2017 Takahashi, Tanaka-Ishii http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
spellingShingle Research Article
Takahashi, Shuntaro
Tanaka-Ishii, Kumiko
Do neural nets learn statistical laws behind natural language?
title Do neural nets learn statistical laws behind natural language?
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5747447/
https://www.ncbi.nlm.nih.gov/pubmed/29287076
http://dx.doi.org/10.1371/journal.pone.0189326