
An Incremental Class-Learning Approach with Acoustic Novelty Detection for Acoustic Event Recognition

Acoustic scene analysis (ASA) relies on the dynamic sensing and understanding of stationary and non-stationary sounds from various events, background noises and human actions with objects. However, the spatio-temporal nature of the sound signals may not be stationary, and novel events may exist that...


Bibliographic Details
Main Authors:	Bayram, Barış; İnce, Gökhan
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8512090/ https://www.ncbi.nlm.nih.gov/pubmed/34640943 http://dx.doi.org/10.3390/s21196622

author	Bayram, Barış; İnce, Gökhan
collection	PubMed
description	Acoustic scene analysis (ASA) relies on the dynamic sensing and understanding of stationary and non-stationary sounds from various events, background noises and human actions with objects. However, the spatio-temporal nature of the sound signals may not be stationary, and novel events may exist that eventually deteriorate the performance of the analysis. In this study, a self-learning-based ASA for acoustic event recognition (AER) is presented to detect and incrementally learn novel acoustic events by tackling catastrophic forgetting. The proposed ASA framework comprises six elements: (1) raw acoustic signal pre-processing, (2) low-level and deep audio feature extraction, (3) acoustic novelty detection (AND), (4) acoustic signal augmentations, (5) incremental class-learning (ICL) (of the audio features of the novel events) and (6) AER. The self-learning on different types of audio features extracted from the acoustic signals of various events occurs without human supervision. For the extraction of deep audio representations, in addition to visual geometry group (VGG) and residual neural network (ResNet), time-delay neural network (TDNN) and TDNN-based long short-term memory (TDNN–LSTM) networks are pre-trained using a large-scale audio dataset, Google AudioSet. The performances of ICL with AND using Mel-spectrograms, and deep features with TDNNs, VGG, and ResNet from the Mel-spectrograms are validated on benchmark audio datasets such as ESC-10, ESC-50, UrbanSound8K (US8K), and an audio dataset collected by the authors in a real domestic environment.
format	Online Article Text
id	pubmed-8512090
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8512090 2021-10-14 Bayram, Barış; İnce, Gökhan. Sensors (Basel). Article. MDPI, 2021-10-05. /pmc/articles/PMC8512090/ /pubmed/34640943 http://dx.doi.org/10.3390/s21196622 Text en. © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	An Incremental Class-Learning Approach with Acoustic Novelty Detection for Acoustic Event Recognition
topic	Article
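
The detect-then-learn loop named in the description (acoustic novelty detection followed by incremental class-learning) can be illustrated with a toy sketch. This is not the authors' method: it swaps in a nearest-class-mean classifier with a distance threshold, and the class name `IncrementalPrototypeClassifier`, the `novelty_threshold` parameter, the event labels and the 2-D feature vectors are all hypothetical stand-ins for real audio features such as Mel-spectrogram embeddings.

```python
import math


class IncrementalPrototypeClassifier:
    """Toy nearest-class-mean classifier that adds a class for novel samples.

    Assumption: a sample is "novel" when its Euclidean distance to every
    known class mean exceeds novelty_threshold (a hypothetical cutoff).
    """

    def __init__(self, novelty_threshold):
        self.novelty_threshold = novelty_threshold
        self.means = {}   # label -> running mean feature vector
        self.counts = {}  # label -> number of samples used for the mean

    def _nearest(self, x):
        # Return the closest known label and its distance (None, inf if empty).
        best_label, best_dist = None, math.inf
        for label, mean in self.means.items():
            d = math.dist(x, mean)
            if d < best_dist:
                best_label, best_dist = label, d
        return best_label, best_dist

    def learn(self, x, label):
        # Update only the given class; other class means are left untouched,
        # so adding a class cannot overwrite what was learned before.
        if label not in self.means:
            self.means[label] = list(x)
            self.counts[label] = 1
            return
        n = self.counts[label] + 1
        self.means[label] = [m + (xi - m) / n
                             for m, xi in zip(self.means[label], x)]
        self.counts[label] = n

    def observe(self, x):
        # Recognize x as a known class, or self-label it as a new class.
        label, dist = self._nearest(x)
        if dist > self.novelty_threshold:
            label = f"novel_{len(self.means)}"
            self.learn(x, label)
        return label


clf = IncrementalPrototypeClassifier(novelty_threshold=1.0)
clf.learn([0.0, 0.0], "dog_bark")
clf.learn([5.0, 5.0], "door_knock")
labels = [clf.observe(v) for v in ([0.1, -0.1], [10.0, 0.0], [10.2, 0.1])]
print(labels)
```

In this sketch the second vector is far from both known means, so it is stored as a new class without supervision, and the third vector is then recognized as that class. A real system would replace the fixed threshold and class means with trained novelty detectors and classifiers.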
