
A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments

Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. M...


Bibliographic Details
Main Authors: Mi, Jing; Colburn, H. Steven
Format: Online Article Text
Language: English
Published: SAGE Publications 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5051670/
https://www.ncbi.nlm.nih.gov/pubmed/27698261
http://dx.doi.org/10.1177/2331216516669919
_version_ 1782458121388556288
author Mi, Jing
Colburn, H. Steven
author_facet Mi, Jing
Colburn, H. Steven
author_sort Mi, Jing
collection PubMed
description Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal, and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires few computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model.
format Online
Article
Text
id pubmed-5051670
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher SAGE Publications
record_format MEDLINE/PubMed
spelling pubmed-5051670 2016-10-18 A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments Mi, Jing Colburn, H. Steven Trends Hear Original Article
SAGE Publications 2016-10-03 /pmc/articles/PMC5051670/ /pubmed/27698261 http://dx.doi.org/10.1177/2331216516669919 Text en © The Author(s) 2016 http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 3.0 License (http://www.creativecommons.org/licenses/by-nc/3.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage).
spellingShingle Original Article
Mi, Jing
Colburn, H. Steven
A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments
title A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments
topic Original Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5051670/
https://www.ncbi.nlm.nih.gov/pubmed/27698261
http://dx.doi.org/10.1177/2331216516669919
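The abstract describes estimating a time-frequency binary mask by cancelling the target with EC processing and using the energy change between EC input and output as the feature for a binary decision. The following is only a minimal sketch of that idea, not the authors' implementation. It assumes the target is straight ahead (so equalization needs no delay or gain and cancellation is a plain left-minus-right subtraction), that each time-frequency unit is represented by one complex coefficient per ear, and that a unit counts as target-dominated when the energy drops by more than a hypothetical threshold of 6 dB. The function names, the threshold, and the `eps` guard are illustrative assumptions; the record does not specify the filterbank, time windows, or decision threshold used in the article.

```python
import math


def ec_energy_change_db(left, right, eps=1e-12):
    """Energy change (dB) between EC input and output for one TF unit.

    Assumption: the target is straight ahead, so equalization is the
    identity and cancellation is simply left minus right.
    """
    residual = left - right                      # cancellation step
    energy_in = abs(left) ** 2 + abs(right) ** 2
    energy_out = abs(residual) ** 2
    # eps (hypothetical guard) avoids log(0) and division by zero.
    return 10.0 * math.log10((energy_out + eps) / (energy_in + eps))


def ec_binary_mask(left_tf, right_tf, threshold_db=-6.0):
    """Return a binary mask (nested lists of bool), True = target-dominated.

    left_tf, right_tf: nested lists [frequency][time] of complex values.
    threshold_db is a hypothetical decision threshold, not from the article.
    """
    return [
        [ec_energy_change_db(l, r) < threshold_db for l, r in zip(l_row, r_row)]
        for l_row, r_row in zip(left_tf, right_tf)
    ]


if __name__ == "__main__":
    # One frequency channel, four time frames (toy values):
    # identical at both ears, opposite sign, 90-degree phase offset, silence.
    left_tf = [[1 + 1j, 1.0 + 0j, 1.0 + 0j, 0j]]
    right_tf = [[1 + 1j, -1.0 + 0j, 1j, 0j]]
    print(ec_binary_mask(left_tf, right_tf))
```

In this sketch a unit whose signal is identical at both ears is almost fully cancelled and is kept, while units with interaural differences (or no energy at all) leave a large residual and are discarded. A mask of this kind could then be applied to the mixture before an intelligibility metric is computed, as the abstract describes for the Coherence-based Speech Intelligibility Index.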