Zipf's Law in Short-Time Timbral Codings of Speech, Music, and Environmental Sound Signals

Timbre is a key perceptual feature that allows discrimination between different sounds. Timbral sensations are highly dependent on the temporal evolution of the power spectrum of an audio signal. In order to quantitatively characterize such sensations, the shape of the power spectrum has to be encod...

Bibliographic Details
Main authors: Haro, Martín; Serrà, Joan; Herrera, Perfecto; Corral, Álvaro
Format: Online Article Text
Language: English
Published: Public Library of Science 2012
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3315504/
https://www.ncbi.nlm.nih.gov/pubmed/22479497
http://dx.doi.org/10.1371/journal.pone.0033993
collection PubMed
description Timbre is a key perceptual feature that allows discrimination between different sounds. Timbral sensations are highly dependent on the temporal evolution of the power spectrum of an audio signal. In order to quantitatively characterize such sensations, the shape of the power spectrum has to be encoded in a way that preserves certain physical and perceptual properties. Therefore, it is common practice to encode short-time power spectra using psychoacoustical frequency scales. In this paper, we study and characterize the statistical properties of such encodings, here called timbral code-words. In particular, we report on rank-frequency distributions of timbral code-words extracted from 740 hours of audio coming from disparate sources such as speech, music, and environmental sounds. Analogously to text corpora, we find a heavy-tailed Zipfian distribution with exponent close to one. Importantly, this distribution is found independently of different encoding decisions and regardless of the audio source. Further analysis on the intrinsic characteristics of most and least frequent code-words reveals that the most frequent code-words tend to have a more homogeneous structure. We also find that speech and music databases have specific, distinctive code-words while, in the case of the environmental sounds, these database-specific code-words are not present. Finally, we find that a Yule-Simon process with memory provides a reasonable quantitative approximation for our data, suggesting the existence of a common simple generative mechanism for all considered sound sources.
id pubmed-3315504
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
spelling pubmed-3315504 2012-04-04 PLoS One Research Article Public Library of Science 2012-03-29 Text en Haro et al.
http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
topic Research Article