Multimodal and Spectral Degradation Effects on Speech and Emotion Recognition in Adult Listeners
For cochlear implant (CI) users, degraded spectral input hampers the understanding of prosodic vocal emotion, especially in difficult listening conditions. Using a vocoder simulation of CI hearing, we examined the extent to which informative multimodal cues in a talker’s spoken expressions improve normal hearing (NH) adults’ speech and emotion perception under different levels of spectral degradation (two, three, four, and eight spectral bands). Participants repeated the words verbatim and identified emotions (among four alternative options: happy, sad, angry, and neutral) in meaningful sentences that were semantically congruent with the expression of the intended emotion. Sentences were presented in their natural speech form and in speech sampled through a noise-band vocoder, in sound (auditory-only) and video (auditory–visual) recordings of a female talker. Visual information had a more pronounced benefit in enhancing speech recognition in the lower spectral band conditions. Spectral degradation, however, did not interfere with emotion recognition performance when dynamic visual cues in a talker’s expression were provided, as participants scored at ceiling levels across all spectral band conditions. Our use of familiar sentences that contained congruent semantic and prosodic information has high ecological validity, which likely optimized listener performance under simulated CI hearing and may better predict CI users’ outcomes in everyday listening contexts.
Main Authors: | Ritter, Chantel; Vongpaisal, Tara |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications, 2018 |
Subjects: | Original Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6236866/ https://www.ncbi.nlm.nih.gov/pubmed/30378469 http://dx.doi.org/10.1177/2331216518804966 |
author | Ritter, Chantel; Vongpaisal, Tara |
collection | PubMed |
description | For cochlear implant (CI) users, degraded spectral input hampers the understanding of prosodic vocal emotion, especially in difficult listening conditions. Using a vocoder simulation of CI hearing, we examined the extent to which informative multimodal cues in a talker’s spoken expressions improve normal hearing (NH) adults’ speech and emotion perception under different levels of spectral degradation (two, three, four, and eight spectral bands). Participants repeated the words verbatim and identified emotions (among four alternative options: happy, sad, angry, and neutral) in meaningful sentences that were semantically congruent with the expression of the intended emotion. Sentences were presented in their natural speech form and in speech sampled through a noise-band vocoder, in sound (auditory-only) and video (auditory–visual) recordings of a female talker. Visual information had a more pronounced benefit in enhancing speech recognition in the lower spectral band conditions. Spectral degradation, however, did not interfere with emotion recognition performance when dynamic visual cues in a talker’s expression were provided, as participants scored at ceiling levels across all spectral band conditions. Our use of familiar sentences that contained congruent semantic and prosodic information has high ecological validity, which likely optimized listener performance under simulated CI hearing and may better predict CI users’ outcomes in everyday listening contexts. |
format | Online Article Text |
id | pubmed-6236866 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2018 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-6236866 2018-11-19. Multimodal and Spectral Degradation Effects on Speech and Emotion Recognition in Adult Listeners. Ritter, Chantel; Vongpaisal, Tara. Trends Hear, Original Article. SAGE Publications 2018-10-31 /pmc/articles/PMC6236866/ /pubmed/30378469 http://dx.doi.org/10.1177/2331216518804966 Text en © The Author(s) 2018 http://creativecommons.org/licenses/by-nc/4.0/ Creative Commons Non Commercial CC BY-NC: This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 License, which permits non-commercial use, reproduction, and distribution of the work without further permission, provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
title | Multimodal and Spectral Degradation Effects on Speech and Emotion Recognition in Adult Listeners |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6236866/ https://www.ncbi.nlm.nih.gov/pubmed/30378469 http://dx.doi.org/10.1177/2331216518804966 |