Multisensory benefits for speech recognition in noisy environments

Bibliographic Details
Main Authors: Oh, Yonghee; Schwalm, Meg; Kalpin, Nicole
Format: Online Article Text
Language: English
Published: Frontiers Media S.A., 2022
Subjects: Neuroscience
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9630463/
https://www.ncbi.nlm.nih.gov/pubmed/36340778
http://dx.doi.org/10.3389/fnins.2022.1031424
collection PubMed
description A series of our previous studies explored the use of an abstract visual representation of the amplitude envelope cues from target sentences to benefit speech perception in complex listening environments. The purpose of this study was to extend this auditory-visual speech perception to the tactile domain. Twenty adults participated in speech recognition measurements in four sensory modalities (AO, auditory-only; AV, auditory-visual; AT, auditory-tactile; AVT, auditory-visual-tactile). The target sentences were fixed at 65 dB sound pressure level and embedded within a simultaneous speech-shaped noise masker at varying signal-to-noise ratios (−7, −5, −3, −1, and 1 dB SNR). The amplitudes of both the abstract visual and the vibrotactile stimuli were temporally synchronized with the target speech envelope for comparison. On average, adding temporally synchronized multimodal cues to the auditory signal provided significant improvements in word recognition performance across all three multimodal stimulus conditions (AV, AT, and AVT), especially at the lower SNR levels of −7, −5, and −3 dB, for both male (8–20% improvement) and female (5–25% improvement) talkers. The greatest improvement in word recognition performance (15–19% for male talkers and 14–25% for female talkers) was observed when visual and tactile cues were combined (AVT). Another notable finding is that temporally synchronized abstract visual and vibrotactile stimuli stack additively in their influence on speech recognition performance. Our findings suggest that the multisensory integration process in speech perception requires salient temporal cues to enhance speech recognition ability in noisy environments.
id pubmed-9630463
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
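The abstract describes visual and vibrotactile cues whose amplitudes were temporally synchronized with the target speech's amplitude envelope. As an illustration only (not the authors' actual pipeline), a common way to extract such an envelope is the Hilbert transform followed by a low-pass filter; the 30 Hz cutoff, filter order, and test signal below are assumed values:

```python
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def amplitude_envelope(signal, fs, cutoff_hz=30.0):
    """Smoothed amplitude envelope: magnitude of the analytic signal,
    low-pass filtered to keep only the slow modulations."""
    env = np.abs(hilbert(signal))
    b, a = butter(4, cutoff_hz / (fs / 2), btype="low")
    return filtfilt(b, a, env)  # zero-phase filtering preserves timing

# Synthetic check: a 500 Hz tone amplitude-modulated at 4 Hz.
# The extracted envelope should track the 4 Hz modulation.
fs = 8000
t = np.arange(fs) / fs
carrier = np.sin(2 * np.pi * 500 * t)
modulation = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))
env = amplitude_envelope(modulation * carrier, fs)
```

In a setup like the one the abstract describes, an envelope of this kind (suitably downsampled) could drive the brightness of an abstract visual stimulus or the intensity of a vibrotactile actuator.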
journal Front Neurosci (Neuroscience)
published online 2022-10-20 /pmc/articles/PMC9630463/ /pubmed/36340778 http://dx.doi.org/10.3389/fnins.2022.1031424
license Copyright © 2022 Oh, Schwalm and Kalpin. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
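The study fixed the target sentences at 65 dB SPL and varied the masker level to produce SNRs from −7 to +1 dB. A minimal sketch of how a noise masker can be scaled to a desired SNR relative to a fixed-level target (illustrative only; the signal names, lengths, and white-noise stand-ins are assumptions, not taken from the paper):

```python
import numpy as np

def scale_noise_to_snr(speech: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Return noise scaled so that 20*log10(RMS(speech)/RMS(noise)) == snr_db."""
    def rms(x):
        return np.sqrt(np.mean(x ** 2))
    # SNR(dB) = 20*log10(rms_speech / rms_noise)
    # => required rms_noise = rms_speech / 10**(snr_db / 20)
    target_noise_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    return noise * (target_noise_rms / rms(noise))

rng = np.random.default_rng(0)
speech = rng.standard_normal(16_000)  # stand-in for a fixed-level target sentence
noise = rng.standard_normal(16_000)   # stand-in for speech-shaped noise
scaled = scale_noise_to_snr(speech, noise, -5.0)
achieved = 20 * np.log10(np.sqrt(np.mean(speech ** 2)) / np.sqrt(np.mean(scaled ** 2)))
```

Keeping the target level fixed and moving only the masker, as the abstract describes, means listener loudness for the target is constant while task difficulty varies with SNR.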
topic Neuroscience