
Open Set Audio Classification Using Autoencoders Trained on Few Data

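Open-set recognition (OSR) is a challenging machine learning problem that appears when classifiers are faced with test instances from classes not seen during training. It can be summarized as the problem of correctly identifying instances from a known class (seen during training) while rejecting any unknown or unwanted samples (those belonging to unseen classes). Another problem arising in practical scenarios is few-shot learning (FSL), which appears when there is no availability of a large number of positive samples for training a recognition system. Taking these two limitations into account, a new dataset for OSR and FSL for audio data was recently released to promote research on solutions aimed at addressing both limitations. This paper proposes an audio OSR/FSL system divided into three steps: a high-level audio representation, feature embedding using two different autoencoder architectures and a multi-layer perceptron (MLP) trained on latent space representations to detect known classes and reject unwanted ones. An extensive set of experiments is carried out considering multiple combinations of openness factors (OSR condition) and number of shots (FSL condition), showing the validity of the proposed approach and confirming superior performance with respect to a baseline system based on transfer learning.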


Bibliographic Details
Main Authors:	Naranjo-Alcazar, Javier; Perez-Castanos, Sergi; Zuccarello, Pedro; Antonacci, Fabio; Cobos, Maximo
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7374438/ https://www.ncbi.nlm.nih.gov/pubmed/32635378 http://dx.doi.org/10.3390/s20133741

author	Naranjo-Alcazar, Javier Perez-Castanos, Sergi Zuccarello, Pedro Antonacci, Fabio Cobos, Maximo
author_sort	Naranjo-Alcazar, Javier
collection	PubMed
description	Open-set recognition (OSR) is a challenging machine learning problem that appears when classifiers are faced with test instances from classes not seen during training. It can be summarized as the problem of correctly identifying instances from a known class (seen during training) while rejecting any unknown or unwanted samples (those belonging to unseen classes). Another problem arising in practical scenarios is few-shot learning (FSL), which appears when there is no availability of a large number of positive samples for training a recognition system. Taking these two limitations into account, a new dataset for OSR and FSL for audio data was recently released to promote research on solutions aimed at addressing both limitations. This paper proposes an audio OSR/FSL system divided into three steps: a high-level audio representation, feature embedding using two different autoencoder architectures and a multi-layer perceptron (MLP) trained on latent space representations to detect known classes and reject unwanted ones. An extensive set of experiments is carried out considering multiple combinations of openness factors (OSR condition) and number of shots (FSL condition), showing the validity of the proposed approach and confirming superior performance with respect to a baseline system based on transfer learning.
format	Online Article Text
id	pubmed-7374438
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7374438 2020-08-06 Open Set Audio Classification Using Autoencoders Trained on Few Data. Naranjo-Alcazar, Javier; Perez-Castanos, Sergi; Zuccarello, Pedro; Antonacci, Fabio; Cobos, Maximo. Sensors (Basel). Article. MDPI 2020-07-03. /pmc/articles/PMC7374438/ /pubmed/32635378 http://dx.doi.org/10.3390/s20133741 Text en © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	Open Set Audio Classification Using Autoencoders Trained on Few Data
topic	Article
