
Languages cool as they expand: Allometric scaling and the decreasing need for new words

We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions...

Bibliographic Details
Main authors: Petersen, Alexander M., Tenenbaum, Joel N., Havlin, Shlomo, Stanley, H. Eugene, Perc, Matjaž
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3517984/
https://www.ncbi.nlm.nih.gov/pubmed/23230508
http://dx.doi.org/10.1038/srep00943
_version_ 1782252502667755520
author Petersen, Alexander M.
Tenenbaum, Joel N.
Havlin, Shlomo
Stanley, H. Eugene
Perc, Matjaž
author_sort Petersen, Alexander M.
collection PubMed
description We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use which has a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This “cooling pattern” forms the basis of a third statistical regularity, which unlike the Zipf and the Heaps law, is dynamical in nature.
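The two static regularities named in the description can be illustrated on any token list. The following is a minimal sketch, not the authors' procedure: the record does not say how the exponents were fitted, so the ordinary least-squares fit in log-log space and the helper names (rank_frequency, vocabulary_growth, loglog_slope) are assumptions made for illustration. A Heaps-type slope below 1 means the vocabulary grows more slowly than the corpus, which is the "decreasing marginal need for new words" mentioned above.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Word counts sorted from most to least frequent (Zipf rank-frequency curve)."""
    return sorted(Counter(tokens).values(), reverse=True)


def vocabulary_growth(tokens):
    """(corpus_size, vocabulary_size) after each token is read (Heaps-type curve)."""
    seen = set()
    curve = []
    for n, tok in enumerate(tokens, start=1):
        seen.add(tok)
        curve.append((n, len(seen)))
    return curve


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x).

    Illustrative estimator only (an assumption of this sketch); it needs at
    least two points with distinct, strictly positive x values.
    """
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


if __name__ == "__main__":
    tokens = "the cat saw the dog and the dog saw the cat".split()
    print(rank_frequency(tokens))
    print(loglog_slope(vocabulary_growth(tokens)))
```

Passing the output of vocabulary_growth to loglog_slope gives a rough Heaps-type exponent; applying loglog_slope to (rank, count) pairs built from rank_frequency gives a rough Zipf-type slope. Real corpora would call for binning and restricted fit ranges, which this sketch leaves out.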
format Online
Article
Text
id pubmed-3517984
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-3517984 2012-12-10 Sci Rep Article Nature Publishing Group /pmc/articles/PMC3517984/ /pubmed/23230508 http://dx.doi.org/10.1038/srep00943 Text en Copyright © 2012, Macmillan Publishers Limited. All rights reserved. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/3.0/
title Languages cool as they expand: Allometric scaling and the decreasing need for new words
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3517984/
https://www.ncbi.nlm.nih.gov/pubmed/23230508
http://dx.doi.org/10.1038/srep00943