Zipf’s Law Arises Naturally When There Are Underlying, Unobserved Variables

Zipf’s law, which states that the probability of an observation is inversely proportional to its rank, has been observed in many domains. While there are models that explain Zipf’s law in each of them, those explanations are typically domain specific. Recently, methods from statistical physics were...

Bibliographic Details
Main authors: Aitchison, Laurence; Corradi, Nicola; Latham, Peter E.
Format: Online Article Text
Language: English
Published: Public Library of Science 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5172588/
https://www.ncbi.nlm.nih.gov/pubmed/27997544
http://dx.doi.org/10.1371/journal.pcbi.1005110
_version_ 1782484155992375296
author Aitchison, Laurence
Corradi, Nicola
Latham, Peter E.
collection PubMed
description Zipf’s law, which states that the probability of an observation is inversely proportional to its rank, has been observed in many domains. While there are models that explain Zipf’s law in each of them, those explanations are typically domain specific. Recently, methods from statistical physics were used to show that a fairly broad class of models does provide a general explanation of Zipf’s law. This explanation rests on the observation that real-world data is often generated from underlying causes, known as latent variables. Those latent variables mix together multiple models that do not obey Zipf’s law, giving a model that does. Here we extend that work both theoretically and empirically. Theoretically, we provide a far simpler and more intuitive explanation of Zipf’s law, which at the same time considerably extends the class of models to which this explanation can apply. Furthermore, we also give methods for verifying whether this explanation applies to a particular dataset. Empirically, these advances allowed us to extend this explanation to important classes of data, including word frequencies (the first domain in which Zipf’s law was discovered), data with variable sequence length, and multi-neuron spiking activity.
format Online
Article
Text
id pubmed-5172588
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5172588 2017-01-04 Zipf’s Law Arises Naturally When There Are Underlying, Unobserved Variables Aitchison, Laurence Corradi, Nicola Latham, Peter E. PLoS Comput Biol Research Article Zipf’s law, which states that the probability of an observation is inversely proportional to its rank, has been observed in many domains. While there are models that explain Zipf’s law in each of them, those explanations are typically domain specific. Recently, methods from statistical physics were used to show that a fairly broad class of models does provide a general explanation of Zipf’s law. This explanation rests on the observation that real-world data is often generated from underlying causes, known as latent variables. Those latent variables mix together multiple models that do not obey Zipf’s law, giving a model that does. Here we extend that work both theoretically and empirically. Theoretically, we provide a far simpler and more intuitive explanation of Zipf’s law, which at the same time considerably extends the class of models to which this explanation can apply. Furthermore, we also give methods for verifying whether this explanation applies to a particular dataset. Empirically, these advances allowed us to extend this explanation to important classes of data, including word frequencies (the first domain in which Zipf’s law was discovered), data with variable sequence length, and multi-neuron spiking activity. Public Library of Science 2016-12-20 /pmc/articles/PMC5172588/ /pubmed/27997544 http://dx.doi.org/10.1371/journal.pcbi.1005110 Text en © 2016 Aitchison et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Zipf’s Law Arises Naturally When There Are Underlying, Unobserved Variables
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5172588/
https://www.ncbi.nlm.nih.gov/pubmed/27997544
http://dx.doi.org/10.1371/journal.pcbi.1005110