Long-Range Correlation Underlying Childhood Language and Generative Models
Long-range correlation, a property of time series exhibiting relevant statistical dependence between two distant subsequences, is mainly studied in the statistical physics domain and has been reported to exist in natural language. By using a state-of-the-art method for such analysis, long-range corr...
Main author: | Tanaka-Ishii, Kumiko |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2018 |
Subjects: | Psychology |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157415/ https://www.ncbi.nlm.nih.gov/pubmed/30283378 http://dx.doi.org/10.3389/fpsyg.2018.01725 |
_version_ | 1783358267491090432 |
---|---|
author | Tanaka-Ishii, Kumiko |
author_facet | Tanaka-Ishii, Kumiko |
author_sort | Tanaka-Ishii, Kumiko |
collection | PubMed |
description | Long-range correlation, a property of time series exhibiting relevant statistical dependence between two distant subsequences, is mainly studied in the statistical physics domain and has been reported to exist in natural language. By using a state-of-the-art method for such analysis, long-range correlation is first shown to occur in long CHILDES data sets. To understand why, generative stochastic models of language, originally proposed in the cognitive scientific domain, are investigated. Among representative models, the Simon model is found to exhibit surprisingly good long-range correlation, but not the Pitman-Yor model. Because the Simon model is known not to correctly reflect the vocabulary growth of natural languages, a simple new model is devised as a conjunct of the Simon and Pitman-Yor models, such that long-range correlation holds with a correct vocabulary growth rate. The investigation overall suggests that uniform sampling is one cause of long-range correlation and could thus have some relation with actual linguistic processes. |
format | Online Article Text |
id | pubmed-6157415 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-61574152018-10-03 Long-Range Correlation Underlying Childhood Language and Generative Models Tanaka-Ishii, Kumiko Front Psychol Psychology Long-range correlation, a property of time series exhibiting relevant statistical dependence between two distant subsequences, is mainly studied in the statistical physics domain and has been reported to exist in natural language. By using a state-of-the-art method for such analysis, long-range correlation is first shown to occur in long CHILDES data sets. To understand why, generative stochastic models of language, originally proposed in the cognitive scientific domain, are investigated. Among representative models, the Simon model is found to exhibit surprisingly good long-range correlation, but not the Pitman-Yor model. Because the Simon model is known not to correctly reflect the vocabulary growth of natural languages, a simple new model is devised as a conjunct of the Simon and Pitman-Yor models, such that long-range correlation holds with a correct vocabulary growth rate. The investigation overall suggests that uniform sampling is one cause of long-range correlation and could thus have some relation with actual linguistic processes. Frontiers Media S.A. 2018-09-19 /pmc/articles/PMC6157415/ /pubmed/30283378 http://dx.doi.org/10.3389/fpsyg.2018.01725 Text en Copyright © 2018 Tanaka-Ishii. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Psychology Tanaka-Ishii, Kumiko Long-Range Correlation Underlying Childhood Language and Generative Models |
title | Long-Range Correlation Underlying Childhood Language and Generative Models |
title_full | Long-Range Correlation Underlying Childhood Language and Generative Models |
title_fullStr | Long-Range Correlation Underlying Childhood Language and Generative Models |
title_full_unstemmed | Long-Range Correlation Underlying Childhood Language and Generative Models |
title_short | Long-Range Correlation Underlying Childhood Language and Generative Models |
title_sort | long-range correlation underlying childhood language and generative models |
topic | Psychology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6157415/ https://www.ncbi.nlm.nih.gov/pubmed/30283378 http://dx.doi.org/10.3389/fpsyg.2018.01725 |
work_keys_str_mv | AT tanakaishiikumiko longrangecorrelationunderlyingchildhoodlanguageandgenerativemodels |