Audio object classification using distributed beliefs and attention
One of the unique characteristics of human hearing is its ability to recognize acoustic objects even in the presence of severe noise and distortions. In this work, we explore two mechanisms underlying this ability: 1) redundant mapping of acoustic waveforms along distributed latent representations and 2...
Main authors: | Bellur, Ashwin; Elhilali, Mounya |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2020 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869589/ https://www.ncbi.nlm.nih.gov/pubmed/33564695 http://dx.doi.org/10.1109/taslp.2020.2966867 |
_version_ | 1783648658795790336 |
---|---|
author | Bellur, Ashwin Elhilali, Mounya |
author_facet | Bellur, Ashwin Elhilali, Mounya |
author_sort | Bellur, Ashwin |
collection | PubMed |
description | One of the unique characteristics of human hearing is its ability to recognize acoustic objects even in the presence of severe noise and distortions. In this work, we explore two mechanisms underlying this ability: 1) redundant mapping of acoustic waveforms along distributed latent representations and 2) adaptive feedback based on prior knowledge to selectively attend to targets of interest. We propose a bio-mimetic account of acoustic object classification by developing a novel distributed deep belief network validated for the task of robust acoustic object classification using the UrbanSound database. The proposed distributed belief network (DBN) encompasses an array of independent sub-networks trained generatively to capture different abstractions of natural sounds. A supervised classifier then performs a readout of this distributed mapping. The overall architecture not only matches the state-of-the-art system for acoustic object classification but also leads to significant improvement over the baseline in mismatched noisy conditions (31.4% relative improvement in 0 dB conditions). Furthermore, we incorporate mechanisms of attentional feedback that allow the DBN to deploy local memories of sound targets estimated at multiple views to bias network activation when attending to a particular object. This adaptive feedback results in further improvement of object classification in unseen noise conditions (relative improvement of 54% over the baseline in 0 dB conditions). |
format | Online Article Text |
id | pubmed-7869589 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
record_format | MEDLINE/PubMed |
spelling | pubmed-7869589 2021-02-08 Audio object classification using distributed beliefs and attention Bellur, Ashwin Elhilali, Mounya IEEE/ACM Trans Audio Speech Lang Process Article 2020-01-15 2020 /pmc/articles/PMC7869589/ /pubmed/33564695 http://dx.doi.org/10.1109/taslp.2020.2966867 Text en This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Bellur, Ashwin Elhilali, Mounya Audio object classification using distributed beliefs and attention |
title | Audio object classification using distributed beliefs and attention |
title_sort | audio object classification using distributed beliefs and attention |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7869589/ https://www.ncbi.nlm.nih.gov/pubmed/33564695 http://dx.doi.org/10.1109/taslp.2020.2966867 |
work_keys_str_mv | AT bellurashwin audioobjectclassificationusingdistributedbeliefsandattention AT elhilalimounya audioobjectclassificationusingdistributedbeliefsandattention |
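
The architecture summarized in the description field (independent sub-networks, a supervised readout, and attentional feedback that biases activations toward a remembered target) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the sub-networks are fixed random projections with a sigmoid rather than generatively trained belief networks, the readout is a nearest-centroid rule rather than the paper's classifier, the attentional bias is a simple convex blend, and the class labels, layer sizes, and function names are hypothetical.

```python
import math
import random


def make_subnetwork(n_in, n_hidden, rng):
    """Random weights (n_hidden x n_in) for one hypothetical sub-network."""
    return [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]


def encode(weights, x):
    """Sigmoid activations of one sub-network for the input vector x."""
    return [
        1.0 / (1.0 + math.exp(-sum(w * v for w, v in zip(row, x))))
        for row in weights
    ]


def distributed_encode(subnets, x):
    """Concatenate the activations of all sub-networks into one mapping."""
    out = []
    for weights in subnets:
        out.extend(encode(weights, x))
    return out


def attend(h, memory, gain=0.5):
    """Bias activations h toward a stored target memory (assumed convex blend)."""
    return [(1.0 - gain) * a + gain * m for a, m in zip(h, memory)]


def readout(h, centroids):
    """Nearest-centroid readout, a stand-in for the supervised classifier."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(h, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Toy usage with made-up sizes: 3 sub-networks, 20 inputs, 8 hidden units each.
rng = random.Random(0)
subnets = [make_subnetwork(20, 8, rng) for _ in range(3)]
x = [rng.gauss(0.0, 1.0) for _ in range(20)]
h = distributed_encode(subnets, x)

# Hypothetical class memories; real ones would be estimated from training data.
centroids = {"siren": [0.9] * len(h), "dog_bark": [0.1] * len(h)}
memory = centroids["siren"]
biased = attend(h, memory, gain=0.5)
```

In this toy setup, attending to a class pulls the concatenated activations toward that class's stored memory before the readout is applied, which is the general idea of biasing network activation toward a target; how the published model estimates and applies its memories is described in the article itself.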