Toward learning robust contrastive embeddings for binaural sound source localization

Recent deep neural network-based methods provide accurate binaural source localization. These data-driven models map measured binaural cues directly to source locations; hence, their performance depends heavily on the training data distribution. In this paper, we propose a parametric embeddi...

Bibliographic Details
Main authors:	Tang, Duowei, Taseska, Maja, van Waterschoot, Toon
Format:	Online Article Text
Language:	English
Published:	Frontiers Media S.A. 2022
Subjects:	Neuroscience
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9709308/ https://www.ncbi.nlm.nih.gov/pubmed/36465690 http://dx.doi.org/10.3389/fninf.2022.942978

author	Tang, Duowei Taseska, Maja van Waterschoot, Toon
collection	PubMed
description	Recent deep neural network-based methods provide accurate binaural source localization. These data-driven models map measured binaural cues directly to source locations; hence, their performance depends heavily on the training data distribution. In this paper, we propose a parametric embedding that maps the binaural cues to a low-dimensional space where localization can be done with nearest-neighbor regression. We implement the embedding using a neural network, optimized to map points that are close to each other in the latent space (the space of source azimuths or elevations) to nearby points in the embedding space. As a result, the Euclidean distances between the embeddings reflect their source proximities, and the embeddings form a manifold, which makes them interpretable. We show that the proposed embedding generalizes well to various acoustic conditions (with reverberation) different from those encountered during training, and performs better than unsupervised embeddings previously used for binaural localization. In addition, the proposed method performs as well as or better than a feed-forward neural network-based model that directly estimates the source locations from the binaural cues, and it outperforms the feed-forward model when only a small amount of training data is used. Moreover, we compare the proposed embedding under both supervised and weakly supervised learning, and show that the resulting embeddings perform similarly well in both conditions, but the weakly supervised embedding allows source azimuth and elevation to be estimated simultaneously.
format	Online Article Text
id	pubmed-9709308
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	Frontiers Media S.A.
record_format	MEDLINE/PubMed
spelling	pubmed-9709308 2022-12-01 Front Neuroinform Frontiers Media S.A. 2022-11-16 /pmc/articles/PMC9709308/ /pubmed/36465690 http://dx.doi.org/10.3389/fninf.2022.942978 Text en Copyright © 2022 Tang, Taseska and van Waterschoot. https://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title	Toward learning robust contrastive embeddings for binaural sound source localization
topic	Neuroscience
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9709308/ https://www.ncbi.nlm.nih.gov/pubmed/36465690 http://dx.doi.org/10.3389/fninf.2022.942978
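
The description in this record outlines two ideas: an embedding in which Euclidean distance reflects source proximity, and localization by nearest-neighbor regression in that embedding space. The following is a minimal sketch of both, not the authors' implementation. The pairwise margin loss is a generic contrastive loss chosen for illustration (the record does not state the exact loss used), and the function names, the `margin` and `k` parameters, and the toy two-dimensional embeddings are all assumptions.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(z_i, z_j, similar, margin=1.0):
    """Generic pairwise contrastive loss (an assumption, not the paper's loss).

    Pairs from nearby source directions (similar=True) are pulled together;
    other pairs are pushed apart until they are at least `margin` away.
    """
    d = euclidean(z_i, z_j)
    if similar:
        return d ** 2
    return max(0.0, margin - d) ** 2


def knn_regress(query, train_embeddings, train_angles, k=3):
    """Estimate a source angle as the mean angle of the k nearest embeddings.

    The plain mean ignores angular wrap-around, so this sketch only suits
    a limited angular range.
    """
    ranked = sorted(
        zip(train_embeddings, train_angles),
        key=lambda pair: euclidean(query, pair[0]),
    )
    nearest = [angle for _, angle in ranked[:k]]
    return sum(nearest) / len(nearest)


# Hypothetical toy data: four 2-D embeddings with known azimuths in degrees.
train_embeddings = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
train_azimuths = [-30.0, -10.0, 10.0, 30.0]

# The query lies between the second and third training points.
estimate = knn_regress((1.4, 0.0), train_embeddings, train_azimuths, k=2)
print(estimate)
```

In practice the embeddings would come from the trained network rather than being written by hand, and a library regressor could replace `knn_regress`.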
