
Audio object classification using distributed beliefs and attention

One of the unique characteristics of human hearing is its ability to recognize acoustic objects even in the presence of severe noise and distortions. In this work, we explore two mechanisms underlying this ability: 1) redundant mapping of acoustic waveforms along distributed latent representations and 2) adaptive feedback based on prior knowledge to selectively attend to targets of interest.


Bibliographic Details

Main Authors: Bellur, Ashwin, Elhilali, Mounya
Format: Online Article Text
Language: English
Published: 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869589/
https://www.ncbi.nlm.nih.gov/pubmed/33564695
http://dx.doi.org/10.1109/taslp.2020.2966867
author Bellur, Ashwin
Elhilali, Mounya
collection PubMed
description One of the unique characteristics of human hearing is its ability to recognize acoustic objects even in the presence of severe noise and distortions. In this work, we explore two mechanisms underlying this ability: 1) redundant mapping of acoustic waveforms along distributed latent representations and 2) adaptive feedback based on prior knowledge to selectively attend to targets of interest. We propose a bio-mimetic account of acoustic object classification by developing a novel distributed deep belief network, validated on the task of robust acoustic object classification using the UrbanSound database. The proposed distributed belief network (DBN) encompasses an array of independent sub-networks trained generatively to capture different abstractions of natural sounds. A supervised classifier then performs a readout of this distributed mapping. The overall architecture not only matches the state-of-the-art system for acoustic object classification but also leads to a significant improvement over the baseline in mismatched noisy conditions (31.4% relative improvement in 0 dB conditions). Furthermore, we incorporate mechanisms of attentional feedback that allow the DBN to deploy local memories of sound targets, estimated at multiple views, to bias network activations when attending to a particular object. This adaptive feedback results in a further improvement of object classification in unseen noise conditions (54% relative improvement over the baseline in 0 dB conditions).
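To make the architecture described above concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of a distributed-belief-style classifier in PyTorch: an array of independent sub-networks maps the input to separate latent views, a supervised linear readout classifies the concatenated views, and class-specific "memories" of sub-network activations provide a simple attentional gain when attending to a target class. The plain sigmoid encoders (stand-ins for the generatively trained sub-networks of the paper), the class-mean memories, the cosine-similarity gating, and all layer sizes are assumptions made purely for illustration.

# Minimal sketch (not the authors' code): a distributed-belief-style classifier.
# Each sub-network independently maps the input to its own latent view; a linear
# readout classifies the concatenated views; an optional attentional gain, derived
# from stored per-class "memories" of sub-network activations, re-weights each
# view before the readout. All design choices here are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class DistributedBeliefClassifier(nn.Module):
    def __init__(self, input_dim, latent_dims, num_classes):
        super().__init__()
        # One independent sub-network per latent "view" of the input.
        self.subnets = nn.ModuleList(
            [nn.Sequential(nn.Linear(input_dim, d), nn.Sigmoid()) for d in latent_dims]
        )
        # Supervised readout over the concatenated distributed representation.
        self.readout = nn.Linear(sum(latent_dims), num_classes)
        # Per-class "memories": mean activation of each view for each class,
        # filled in by update_memories and used to bias activations when
        # attending to a target class.
        self.register_buffer(
            "memories", torch.zeros(num_classes, len(latent_dims), max(latent_dims))
        )
        self.latent_dims = latent_dims

    def encode(self, x):
        return [net(x) for net in self.subnets]

    def forward(self, x, attend_to=None):
        views = self.encode(x)
        if attend_to is not None:
            # Attentional feedback (illustrative): scale each view by its
            # similarity to the stored memory of the attended class.
            biased = []
            for i, v in enumerate(views):
                mem = self.memories[attend_to, i, : self.latent_dims[i]]
                gain = F.cosine_similarity(v, mem.expand_as(v), dim=-1)
                biased.append(v * gain.clamp(min=0.0).unsqueeze(-1))
            views = biased
        return self.readout(torch.cat(views, dim=-1))

    @torch.no_grad()
    def update_memories(self, x, labels):
        # Store class-conditional mean activations of each sub-network.
        views = self.encode(x)
        for c in labels.unique():
            mask = labels == c
            for i, v in enumerate(views):
                self.memories[c, i, : self.latent_dims[i]] = v[mask].mean(dim=0)

if __name__ == "__main__":
    model = DistributedBeliefClassifier(input_dim=128, latent_dims=[64, 32, 16], num_classes=10)
    x = torch.randn(8, 128)                  # batch of spectrogram-like feature vectors
    y = torch.randint(0, 10, (8,))
    model.update_memories(x, y)              # build local "memories" from labelled examples
    logits = model(x)                        # plain feed-forward classification
    attended = model(x, attend_to=3)         # classification while attending to class 3
    print(logits.shape, attended.shape)

In the paper itself the sub-networks are trained generatively and only the readout is supervised; this sketch leaves both to ordinary supervised training solely to keep the example short.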
format Online
Article
Text
id pubmed-7869589
institution National Center for Biotechnology Information
language English
publishDate 2020
record_format MEDLINE/PubMed
spelling pubmed-7869589 2021-02-08
Audio object classification using distributed beliefs and attention
Bellur, Ashwin; Elhilali, Mounya
IEEE/ACM Trans Audio Speech Lang Process (Article)
2020-01-15 2020
/pmc/articles/PMC7869589/ /pubmed/33564695 http://dx.doi.org/10.1109/taslp.2020.2966867
Text en
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
title Audio object classification using distributed beliefs and attention
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869589/
https://www.ncbi.nlm.nih.gov/pubmed/33564695
http://dx.doi.org/10.1109/taslp.2020.2966867