
Beyond Word Frequency: Bursts, Lulls, and Scaling in the Temporal Distributions of Words

BACKGROUND: Zipf's discovery that word frequency distributions obey a power law established parallels between biological and physical processes, and language, laying the groundwork for a complex systems perspective on human communication. More recent research has also identified scaling regular...


Bibliographic Details
Main authors: Altmann, Eduardo G., Pierrehumbert, Janet B., Motter, Adilson E.
Format: Text
Language: English
Published: Public Library of Science 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2770836/
https://www.ncbi.nlm.nih.gov/pubmed/19907645
http://dx.doi.org/10.1371/journal.pone.0007678
author Altmann, Eduardo G.
Pierrehumbert, Janet B.
Motter, Adilson E.
collection PubMed
description BACKGROUND: Zipf's discovery that word frequency distributions obey a power law established parallels between biological and physical processes, and language, laying the groundwork for a complex systems perspective on human communication. More recent research has also identified scaling regularities in the dynamics underlying the successive occurrences of events, suggesting the possibility of similar findings for language as well. METHODOLOGY/PRINCIPAL FINDINGS: By considering frequent words in USENET discussion groups and in disparate databases where the language has different levels of formality, here we show that the distributions of distances between successive occurrences of the same word display bursty deviations from a Poisson process and are well characterized by a stretched exponential (Weibull) scaling. The extent of this deviation depends strongly on semantic type – a measure of the logicality of each word – and less strongly on frequency. We develop a generative model of this behavior that fully determines the dynamics of word usage. CONCLUSIONS/SIGNIFICANCE: Recurrence patterns of words are well described by a stretched exponential distribution of recurrence times, an empirical scaling that cannot be anticipated from Zipf's law. Because the use of words provides a uniquely precise and powerful lens on human thought and activity, our findings also have implications for other overt manifestations of collective human dynamics.
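The recurrence-time analysis summarized in the description can be illustrated with a short sketch. It is not the authors' code: the function names `recurrence_distances` and `fit_weibull` are hypothetical, the data are synthetic, and the maximum-likelihood fit is a standard technique chosen here for illustration, since this record does not state how the paper fitted its distributions. The Weibull (stretched exponential) density is f(τ) = (β/τ0)(τ/τ0)^(β−1) exp(−(τ/τ0)^β); β = 1 gives the exponential distribution expected for a Poisson process, while β < 1 corresponds to bursty usage with clusters of short gaps and occasional long lulls.

```python
import math
import random


def recurrence_distances(tokens, word):
    """Return distances, in tokens, between successive occurrences of `word`."""
    positions = [i for i, tok in enumerate(tokens) if tok == word]
    return [b - a for a, b in zip(positions, positions[1:])]


def fit_weibull(distances, lo=0.05, hi=20.0, iterations=80):
    """Maximum-likelihood Weibull fit; returns (beta, tau0).

    beta is the shape (stretching exponent) and tau0 the scale.
    The search bracket [lo, hi] for beta is an arbitrary assumption.
    """
    xs = [float(x) for x in distances]
    if len(set(xs)) < 2 or min(xs) <= 0:
        raise ValueError("need at least two distinct positive distances")
    biggest = max(xs)
    ys = [x / biggest for x in xs]  # rescale so y**beta cannot overflow
    logs = [math.log(y) for y in ys]
    mean_log = sum(logs) / len(logs)

    def score(beta):
        # Likelihood equation for the shape parameter; zero at the estimate.
        weights = [y ** beta for y in ys]
        weighted = sum(w * l for w, l in zip(weights, logs)) / sum(weights)
        return weighted - 1.0 / beta - mean_log

    if score(lo) > 0 or score(hi) < 0:
        raise ValueError("shape estimate lies outside the search bracket")
    for _ in range(iterations):  # bisection: score increases with beta
        mid = 0.5 * (lo + hi)
        if score(mid) < 0:
            lo = mid
        else:
            hi = mid
    beta = 0.5 * (lo + hi)
    tau0 = biggest * (sum(y ** beta for y in ys) / len(ys)) ** (1.0 / beta)
    return beta, tau0


if __name__ == "__main__":
    # Toy token stream: "a" occurs at positions 0, 2, 3 and 5.
    print(recurrence_distances(["a", "b", "a", "a", "c", "a"], "a"))  # [2, 1, 2]

    # Synthetic bursty recurrence times (shape 0.5), not data from the paper.
    rng = random.Random(0)
    sample = [rng.weibullvariate(100.0, 0.5) for _ in range(5000)]
    beta, tau0 = fit_weibull(sample)
    print(f"beta = {beta:.3f}, tau0 = {tau0:.1f}")
```

On real text, `recurrence_distances` would be applied to a tokenized corpus for each frequent word. Distances measured in tokens are integers, so a continuous Weibull fit is only an approximation for very short gaps.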
format Text
id pubmed-2770836
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-2770836 2009-11-11 Beyond Word Frequency: Bursts, Lulls, and Scaling in the Temporal Distributions of Words; Altmann, Eduardo G.; Pierrehumbert, Janet B.; Motter, Adilson E.; PLoS One; Research Article; Public Library of Science 2009-11-11 /pmc/articles/PMC2770836/ /pubmed/19907645 http://dx.doi.org/10.1371/journal.pone.0007678 Text en Altmann et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
topic Research Article