Multisensory benefits for speech recognition in noisy environments
A series of our previous studies explored the use of an abstract visual representation of the amplitude envelope cues from target sentences to benefit speech perception in complex listening environments. The purpose of this study was to expand this auditory-visual speech perception to the tactile domain.
Main Authors: | Oh, Yonghee, Schwalm, Meg, Kalpin, Nicole |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Frontiers Media S.A.
2022
|
Subjects: | Neuroscience |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9630463/ https://www.ncbi.nlm.nih.gov/pubmed/36340778 http://dx.doi.org/10.3389/fnins.2022.1031424 |
_version_ | 1784823607424188416 |
---|---|
author | Oh, Yonghee Schwalm, Meg Kalpin, Nicole |
author_facet | Oh, Yonghee Schwalm, Meg Kalpin, Nicole |
author_sort | Oh, Yonghee |
collection | PubMed |
description | A series of our previous studies explored the use of an abstract visual representation of the amplitude envelope cues from target sentences to benefit speech perception in complex listening environments. The purpose of this study was to expand this auditory-visual speech perception to the tactile domain. Twenty adults participated in speech recognition measurements in four different sensory modalities (AO, auditory-only; AV, auditory-visual; AT, auditory-tactile; AVT, auditory-visual-tactile). The target sentences were fixed at 65 dB sound pressure level and embedded within a simultaneous speech-shaped noise masker of varying degrees of signal-to-noise ratios (−7, −5, −3, −1, and 1 dB SNR). The amplitudes of both abstract visual and vibrotactile stimuli were temporally synchronized with the target speech envelope for comparison. Average results showed that adding temporally-synchronized multimodal cues to the auditory signal did provide significant improvements in word recognition performance across all three multimodal stimulus conditions (AV, AT, and AVT), especially at the lower SNR levels of −7, −5, and −3 dB for both male (8–20% improvement) and female (5–25% improvement) talkers. The greatest improvement in word recognition performance (15–19% improvement for males and 14–25% improvement for females) was observed when both visual and tactile cues were integrated (AVT). Another interesting finding in this study is that temporally synchronized abstract visual and vibrotactile stimuli additively stack in their influence on speech recognition performance. Our findings suggest that a multisensory integration process in speech perception requires salient temporal cues to enhance speech recognition ability in noisy environments. |
format | Online Article Text |
id | pubmed-9630463 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-96304632022-11-04 Multisensory benefits for speech recognition in noisy environments Oh, Yonghee Schwalm, Meg Kalpin, Nicole Front Neurosci Neuroscience [abstract identical to the description field above] Frontiers Media S.A. 
2022-10-20 /pmc/articles/PMC9630463/ /pubmed/36340778 http://dx.doi.org/10.3389/fnins.2022.1031424 Text en Copyright © 2022 Oh, Schwalm and Kalpin. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Neuroscience Oh, Yonghee Schwalm, Meg Kalpin, Nicole Multisensory benefits for speech recognition in noisy environments |
title | Multisensory benefits for speech recognition in noisy environments |
title_full | Multisensory benefits for speech recognition in noisy environments |
title_fullStr | Multisensory benefits for speech recognition in noisy environments |
title_full_unstemmed | Multisensory benefits for speech recognition in noisy environments |
title_short | Multisensory benefits for speech recognition in noisy environments |
title_sort | multisensory benefits for speech recognition in noisy environments |
topic | Neuroscience |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9630463/ https://www.ncbi.nlm.nih.gov/pubmed/36340778 http://dx.doi.org/10.3389/fnins.2022.1031424 |
work_keys_str_mv | AT ohyonghee multisensorybenefitsforspeechrecognitioninnoisyenvironments AT schwalmmeg multisensorybenefitsforspeechrecognitioninnoisyenvironments AT kalpinnicole multisensorybenefitsforspeechrecognitioninnoisyenvironments |
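The core signal-processing step described in the abstract, extracting the amplitude envelope of target speech so it can drive temporally synchronized visual or vibrotactile stimuli, can be sketched as follows. This is an illustrative reconstruction, not the authors' code: the Hilbert-transform approach, the 30 Hz smoothing cutoff, and the sample rate are assumptions chosen for the example.

```python
# Illustrative sketch (not the study's actual implementation): extract a
# smoothed amplitude envelope from a speech-band signal. The envelope is
# the cue that, per the abstract, was used to modulate the abstract visual
# and vibrotactile stimuli in temporal synchrony with the target speech.
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def amplitude_envelope(signal, fs, cutoff_hz=30.0):
    """Return a smoothed amplitude envelope of `signal` sampled at `fs` Hz."""
    analytic = hilbert(signal)               # analytic signal via Hilbert transform
    env = np.abs(analytic)                   # instantaneous amplitude
    b, a = butter(4, cutoff_hz / (fs / 2))   # 4th-order low-pass (assumed cutoff)
    return filtfilt(b, a, env)               # zero-phase smoothing

# Example: a 1 s amplitude-modulated tone standing in for speech.
# The recovered envelope should track the 4 Hz modulation, not the carrier.
fs = 16000
t = np.arange(fs) / fs
modulation = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))   # slow 4 Hz envelope
carrier = np.sin(2 * np.pi * 440 * t)                # 440 Hz carrier tone
env = amplitude_envelope(modulation * carrier, fs)
```

In a setup like the one the abstract describes, `env` would then be rescaled to the dynamic range of the display (e.g., the size or brightness of an abstract visual shape) or of a vibrotactile actuator, so that both cues rise and fall with the target speech envelope.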