Spatial speech detection for binaural hearing aids using deep phoneme classifiers

Current hearing aids are limited with respect to speech-specific optimization for spatial sound sources to perform speech enhancement. In this study, we therefore propose an approach for spatial detection of speech based on sound source localization and blind optimization of speech enhancement for b...

Bibliographic Details
Main authors:	Kayser, Hendrik; Hermansky, Hynek; Meyer, Bernd T.
Format:	Online Article Text
Language:	English
Published:	2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9502715/ https://www.ncbi.nlm.nih.gov/pubmed/36159631 http://dx.doi.org/10.1051/aacus/2022013

_version_	1784795774873239552
author	Kayser, Hendrik Hermansky, Hynek Meyer, Bernd T.
collection	PubMed
description	Current hearing aids are limited with respect to speech-specific optimization for spatial sound sources to perform speech enhancement. In this study, we therefore propose an approach for spatial detection of speech based on sound source localization and blind optimization of speech enhancement for binaural hearing aids. We have combined an estimator for the direction of arrival (DOA), featuring high spatial resolution but no specialization to speech, with a measure of speech quality with low spatial resolution obtained after directional filtering. The DOA estimator provides spatial sound source probability in the frontal horizontal plane. The measure of speech quality is based on phoneme representations obtained from a deep neural network, which is part of a hybrid automatic speech recognition (ASR) system. Three ASR-based speech quality measures (ASQMs) are explored: entropy, mean temporal distance (M-Measure), and matched phoneme (MaP) filtering. We tested the approach in four acoustic scenes with one speaker and either a localized or a diffuse noise source at various signal-to-noise ratios (SNRs) in anechoic or reverberant conditions. The effects of incorrect spatial filtering and noise were analyzed. We show that two of the three ASQMs (M-Measure, MaP filtering) are suited to reliably identify the speech target in different conditions. The system is not adapted to the environment and does not require a priori information about the acoustic scene or a reference signal to estimate the quality of the enhanced speech signal. Nevertheless, our approach performs well across all tested acoustic scenes and SNRs and reliably detects incorrect spatial filtering angles.
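As a rough illustration of two of the measures named in the abstract, the following Python sketch computes a mean per-frame entropy and a mean temporal distance from a matrix of phoneme posteriors. It is not the authors' implementation: the function names, the (frames, classes) array layout, the EPS constant, and the use of the symmetric Kullback-Leibler divergence as the frame distance are all assumptions made for this example, since the abstract does not specify them.

```python
import numpy as np

EPS = 1e-12  # assumed floor to avoid log(0); not taken from the paper


def mean_entropy(post):
    """Mean per-frame entropy (in nats) of a (frames, classes) posteriorgram."""
    p = np.clip(post, EPS, 1.0)
    return float(np.mean(-np.sum(p * np.log(p), axis=1)))


def mean_temporal_distance(post, delta):
    """Average distance between posterior frames that are `delta` frames apart.

    The distance is assumed here to be the symmetric KL divergence,
    KL(a||b) + KL(b||a) = sum((a - b) * (log a - log b)).
    """
    if not 1 <= delta < len(post):
        raise ValueError("delta must satisfy 1 <= delta < number of frames")
    p = np.clip(post, EPS, 1.0)
    a, b = p[:-delta], p[delta:]
    sym_kl = np.sum((a - b) * (np.log(a) - np.log(b)), axis=1)
    return float(np.mean(sym_kl))


if __name__ == "__main__":
    # Hypothetical toy input: 6 frames, 4 phoneme classes, one-hot posteriors.
    sharp = np.eye(4)[[0, 1, 2, 3, 0, 1]]
    print(mean_entropy(sharp), mean_temporal_distance(sharp, 1))
```

Under these assumptions, confident and rapidly changing posteriors give low entropy and a large temporal distance, while a flat posteriorgram gives maximal entropy and zero distance. How such scores are mapped to candidate filtering directions is left to the paper itself.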
format	Online Article Text
id	pubmed-9502715
institution	National Center for Biotechnology Information
language	English
publishDate	2022
record_format	MEDLINE/PubMed
spelling	pubmed-9502715 2022-09-23 Spatial speech detection for binaural hearing aids using deep phoneme classifiers Kayser, Hendrik Hermansky, Hynek Meyer, Bernd T. Acta Acust (2020) Article 2022 2022-06-27 /pmc/articles/PMC9502715/ /pubmed/36159631 http://dx.doi.org/10.1051/aacus/2022013 Text en https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Spatial speech detection for binaural hearing aids using deep phoneme classifiers
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9502715/ https://www.ncbi.nlm.nih.gov/pubmed/36159631 http://dx.doi.org/10.1051/aacus/2022013
