A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments
Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. M...
Main Authors: | Mi, Jing, Colburn, H. Steven |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | SAGE Publications 2016 |
Subjects: | Original Article |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5051670/ https://www.ncbi.nlm.nih.gov/pubmed/27698261 http://dx.doi.org/10.1177/2331216516669919 |
_version_ | 1782458121388556288 |
---|---|
author | Mi, Jing Colburn, H. Steven |
author_facet | Mi, Jing Colburn, H. Steven |
author_sort | Mi, Jing |
collection | PubMed |
description | Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model. |
format | Online Article Text |
id | pubmed-5051670 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | SAGE Publications |
record_format | MEDLINE/PubMed |
spelling | pubmed-5051670 2016-10-18 A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments Mi, Jing Colburn, H. Steven Trends Hear Original Article SAGE Publications 2016-10-03 /pmc/articles/PMC5051670/ /pubmed/27698261 http://dx.doi.org/10.1177/2331216516669919 Text en © The Author(s) 2016 http://creativecommons.org/licenses/by-nc/3.0/ This article is distributed under the terms of the Creative Commons Attribution-NonCommercial 3.0 License (http://www.creativecommons.org/licenses/by-nc/3.0/) which permits non-commercial use, reproduction and distribution of the work without further permission provided the original work is attributed as specified on the SAGE and Open Access pages (https://us.sagepub.com/en-us/nam/open-access-at-sage). |
spellingShingle | Original Article Mi, Jing Colburn, H. Steven A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments |
title | A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments |
title_full | A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments |
title_fullStr | A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments |
title_full_unstemmed | A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments |
title_short | A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments |
title_sort | binaural grouping model for predicting speech intelligibility in multitalker environments |
topic | Original Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5051670/ https://www.ncbi.nlm.nih.gov/pubmed/27698261 http://dx.doi.org/10.1177/2331216516669919 |
work_keys_str_mv | AT mijing abinauralgroupingmodelforpredictingspeechintelligibilityinmultitalkerenvironments AT colburnhsteven abinauralgroupingmodelforpredictingspeechintelligibilityinmultitalkerenvironments AT mijing binauralgroupingmodelforpredictingspeechintelligibilityinmultitalkerenvironments AT colburnhsteven binauralgroupingmodelforpredictingspeechintelligibilityinmultitalkerenvironments |
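
Illustrative note on the description field: the abstract describes cancelling the target with EC processing and using the energy change between EC input and output, together with a binary decision rule, to mark time-frequency units dominated by the target. The following is a minimal sketch of that idea for a single time-frequency unit, not the authors' implementation. It assumes the target arrives from straight ahead (so equalization is the identity and cancellation is a plain left-minus-right subtraction), it omits internal noise, and the function names and the `threshold_db` value are hypothetical.

```python
import math


def ec_energy_change_db(left, right, eps=1e-12):
    """Energy change in dB from EC input to EC output for one time-frequency unit.

    Assumption: the target is straight ahead, so it is identical at both ears
    and subtracting the right-ear signal from the left-ear signal cancels it.
    """
    # Input energy: total energy of both ear signals in this unit.
    e_in = sum(l * l + r * r for l, r in zip(left, right))
    # Output energy: what remains after cancelling the assumed target direction.
    e_out = sum((l - r) ** 2 for l, r in zip(left, right))
    # eps keeps the logarithm finite when the cancellation is perfect.
    return 10.0 * math.log10((e_out + eps) / (e_in + eps))


def binary_mask_unit(left, right, threshold_db=-6.0):
    """Binary decision rule: 1 if cancelling the target removed most of the energy.

    threshold_db is a hypothetical value chosen only for illustration.
    """
    return 1 if ec_energy_change_db(left, right) < threshold_db else 0
```

A unit whose two ear signals are identical (target-dominated under the assumption above) loses nearly all its energy in cancellation and is marked 1; a unit whose ear signals differ by a quarter-cycle phase shift keeps its energy and is marked 0.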