
Multi-time resolution analysis of speech: evidence from psychophysics

Bibliographic Details
Main Authors: Chait, Maria, Greenberg, Steven, Arai, Takayuki, Simon, Jonathan Z., Poeppel, David
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2015
Subjects: Psychology
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4468943/
https://www.ncbi.nlm.nih.gov/pubmed/26136650
http://dx.doi.org/10.3389/fnins.2015.00214
_version_ 1782376572327559168
author Chait, Maria
Greenberg, Steven
Arai, Takayuki
Simon, Jonathan Z.
Poeppel, David
author_facet Chait, Maria
Greenberg, Steven
Arai, Takayuki
Simon, Jonathan Z.
Poeppel, David
author_sort Chait, Maria
collection PubMed
description How speech signals are analyzed and represented remains a foundational challenge both for cognitive science and neuroscience. A growing body of research, employing various behavioral and neurobiological experimental techniques, now points to the perceptual relevance of both phoneme-sized (10–40 Hz modulation frequency) and syllable-sized (2–10 Hz modulation frequency) units in speech processing. However, it is not clear how information associated with such different time scales interacts in a manner relevant for speech perception. We report behavioral experiments on speech intelligibility employing a stimulus that allows us to investigate how distinct temporal modulations in speech are treated separately and whether they are combined. We created sentences in which the slow (~4 Hz; S(low)) and rapid (~33 Hz; S(high)) modulations—corresponding to ~250 and ~30 ms, the average duration of syllables and certain phonetic properties, respectively—were selectively extracted. Although S(low) and S(high) have low intelligibility when presented separately, dichotic presentation of S(high) with S(low) results in supra-additive performance, suggesting a synergistic relationship between low- and high-modulation frequencies. A second experiment desynchronized presentation of the S(low) and S(high) signals. Desynchronizing signals relative to one another had no impact on intelligibility when delays were less than ~45 ms. Longer delays resulted in a steep intelligibility decline, providing further evidence of integration or binding of information within restricted temporal windows. Our data suggest that human speech perception uses multi-time resolution processing. Signals are concurrently analyzed on at least two separate time scales, the intermediate representations of these analyses are integrated, and the resulting bound percept has significant consequences for speech intelligibility—a view compatible with recent insights from neuroscience implicating multi-timescale auditory processing.
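The stimulus manipulation described in the abstract (isolating syllable-rate and phoneme-rate temporal modulations of speech) can be illustrated with a short signal-processing sketch. The Python code below is not the authors' actual procedure; the Hilbert-envelope method, filter order, and band edges are illustrative assumptions chosen only to match the modulation ranges quoted in the abstract (2–10 Hz for the slow, syllable-scale band and 10–40 Hz for the fast, phoneme-scale band).

# Illustrative sketch (not the paper's exact method): separate slow
# (syllable-scale) and fast (phoneme-scale) temporal modulations of a
# signal's envelope. Band edges and the Hilbert-envelope approach are
# assumptions based on the ranges quoted in the abstract.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def modulation_bands(x, fs):
    """Return (slow, fast) modulation components of the envelope of x."""
    env = np.abs(hilbert(x))  # broadband temporal envelope via analytic signal
    # Zero-phase 4th-order Butterworth band-pass filters applied to the envelope.
    sos_slow = butter(4, [2.0, 10.0], btype="bandpass", fs=fs, output="sos")
    sos_fast = butter(4, [10.0, 40.0], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos_slow, env), sosfiltfilt(sos_fast, env)

# Toy usage: a noise carrier amplitude-modulated at ~4 Hz and ~33 Hz, the
# two rates highlighted in the abstract.
fs = 16000
t = np.arange(fs) / fs
am = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * (1 + 0.3 * np.sin(2 * np.pi * 33 * t))
x = am * np.random.randn(fs)
slow, fast = modulation_bands(x, fs)

The key manipulations reported in the abstract (dichotic presentation of the two components, and onset delays between them of up to and beyond ~45 ms) would then operate on band-limited signals of this kind.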
format Online
Article
Text
id pubmed-4468943
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-44689432015-07-01 Multi-time resolution analysis of speech: evidence from psychophysics Chait, Maria Greenberg, Steven Arai, Takayuki Simon, Jonathan Z. Poeppel, David Front Neurosci Psychology How speech signals are analyzed and represented remains a foundational challenge both for cognitive science and neuroscience. A growing body of research, employing various behavioral and neurobiological experimental techniques, now points to the perceptual relevance of both phoneme-sized (10–40 Hz modulation frequency) and syllable-sized (2–10 Hz modulation frequency) units in speech processing. However, it is not clear how information associated with such different time scales interacts in a manner relevant for speech perception. We report behavioral experiments on speech intelligibility employing a stimulus that allows us to investigate how distinct temporal modulations in speech are treated separately and whether they are combined. We created sentences in which the slow (~4 Hz; S(low)) and rapid (~33 Hz; S(high)) modulations—corresponding to ~250 and ~30 ms, the average duration of syllables and certain phonetic properties, respectively—were selectively extracted. Although S(low) and S(high) have low intelligibility when presented separately, dichotic presentation of S(high) with S(low) results in supra-additive performance, suggesting a synergistic relationship between low- and high-modulation frequencies. A second experiment desynchronized presentation of the S(low) and S(high) signals. Desynchronizing signals relative to one another had no impact on intelligibility when delays were less than ~45 ms. Longer delays resulted in a steep intelligibility decline, providing further evidence of integration or binding of information within restricted temporal windows. Our data suggest that human speech perception uses multi-time resolution processing. Signals are concurrently analyzed on at least two separate time scales, the intermediate representations of these analyses are integrated, and the resulting bound percept has significant consequences for speech intelligibility—a view compatible with recent insights from neuroscience implicating multi-timescale auditory processing. Frontiers Media S.A. 2015-06-16 /pmc/articles/PMC4468943/ /pubmed/26136650 http://dx.doi.org/10.3389/fnins.2015.00214 Text en Copyright © 2015 Chait, Greenberg, Arai, Simon and Poeppel. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Psychology
Chait, Maria
Greenberg, Steven
Arai, Takayuki
Simon, Jonathan Z.
Poeppel, David
Multi-time resolution analysis of speech: evidence from psychophysics
title Multi-time resolution analysis of speech: evidence from psychophysics
title_full Multi-time resolution analysis of speech: evidence from psychophysics
title_fullStr Multi-time resolution analysis of speech: evidence from psychophysics
title_full_unstemmed Multi-time resolution analysis of speech: evidence from psychophysics
title_short Multi-time resolution analysis of speech: evidence from psychophysics
title_sort multi-time resolution analysis of speech: evidence from psychophysics
topic Psychology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4468943/
https://www.ncbi.nlm.nih.gov/pubmed/26136650
http://dx.doi.org/10.3389/fnins.2015.00214
work_keys_str_mv AT chaitmaria multitimeresolutionanalysisofspeechevidencefrompsychophysics
AT greenbergsteven multitimeresolutionanalysisofspeechevidencefrompsychophysics
AT araitakayuki multitimeresolutionanalysisofspeechevidencefrompsychophysics
AT simonjonathanz multitimeresolutionanalysisofspeechevidencefrompsychophysics
AT poeppeldavid multitimeresolutionanalysisofspeechevidencefrompsychophysics