
Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species

Feature extraction for Acoustic Bird Species Classification (ABSC) tasks has traditionally been based on parametric representations that were specifically developed for speech signals, such as Mel Frequency Cepstral Coefficients (MFCC). However, the discrimination capabilities of these features for...


Bibliographic Details
Main Authors: Ludeña-Choez, Jimmy; Quispe-Soncco, Raisa; Gallardo-Antolín, Ascensión
Format: Online Article Text
Language: English
Published: Public Library of Science 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5476267/
https://www.ncbi.nlm.nih.gov/pubmed/28628630
http://dx.doi.org/10.1371/journal.pone.0179403
_version_ 1783244580669358080
author Ludeña-Choez, Jimmy
Quispe-Soncco, Raisa
Gallardo-Antolín, Ascensión
author_sort Ludeña-Choez, Jimmy
collection PubMed
description Feature extraction for Acoustic Bird Species Classification (ABSC) tasks has traditionally been based on parametric representations that were specifically developed for speech signals, such as Mel Frequency Cepstral Coefficients (MFCC). However, the discrimination capabilities of these features for ABSC could be enhanced by accounting for the vocal production mechanisms of birds, and, in particular, the spectro-temporal structure of bird sounds. In this paper, a new front-end for ABSC is proposed that incorporates this specific information through the non-negative decomposition of bird sound spectrograms. It consists of the following two different stages: short-time feature extraction and temporal feature integration. In the first stage, which aims at providing a better spectral representation of bird sounds on a frame-by-frame basis, two methods are evaluated. In the first method, cepstral-like features (NMF_CC) are extracted by using a filter bank that is automatically learned by means of the application of Non-Negative Matrix Factorization (NMF) on bird audio spectrograms. In the second method, the features are directly derived from the activation coefficients of the spectrogram decomposition as performed through NMF (H_CC). The second stage summarizes the most relevant information contained in the short-time features by computing several statistical measures over long segments. The experiments show that the use of NMF_CC and H_CC in conjunction with temporal integration significantly improves the performance of a Support Vector Machine (SVM)-based ABSC system with respect to conventional MFCC.
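The two-stage front end described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Euclidean cost with Lee-Seung multiplicative updates, takes the log of the activation matrix H without the cosine transform that a cepstral-like feature would normally add, and uses mean and standard deviation as the only segment statistics. The function names (`nmf`, `integrate`), the toy matrix `V`, the rank, and the segment length are all hypothetical.

```python
import math
import random
import statistics


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factor non-negative V (frequency bins x frames) as W * H.

    Uses multiplicative updates for the Euclidean cost, which is an
    assumed choice; the source does not state the cost function.
    """
    rng = random.Random(seed)
    n_freq, n_frames = len(V), len(V[0])
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_freq)]
    H = [[rng.random() + eps for _ in range(n_frames)] for _ in range(rank)]
    for _ in range(iters):
        # Update activations: H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # Update basis vectors: W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    return W, H


def integrate(features, seg_len):
    """Summarize frame-level features (rows = coefficients, columns = frames)
    with the mean and standard deviation of each non-overlapping segment."""
    n_frames = len(features[0])
    out = []
    for start in range(0, n_frames - seg_len + 1, seg_len):
        seg = [row[start:start + seg_len] for row in features]
        out.append([statistics.mean(r) for r in seg]
                   + [statistics.pstdev(r) for r in seg])
    return out


# Toy magnitude "spectrogram": 3 frequency bins x 4 frames (made-up values).
V = [[4.0, 0.0, 2.0, 0.0],
     [2.0, 0.0, 1.0, 0.0],
     [0.0, 3.0, 0.0, 6.0]]

W, H = nmf(V, rank=2)                                     # stage 1: decomposition
log_h = [[math.log(h + 1e-9) for h in row] for row in H]  # log-activations
segments = integrate(log_h, seg_len=2)                    # stage 2: integration
```

Each row of `segments` is one long-segment feature vector (means followed by standard deviations) of the kind that could be passed to a classifier such as an SVM.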
format Online
Article
Text
id pubmed-5476267
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-5476267 2017-07-03 Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species Ludeña-Choez, Jimmy Quispe-Soncco, Raisa Gallardo-Antolín, Ascensión PLoS One Research Article
Public Library of Science 2017-06-19 /pmc/articles/PMC5476267/ /pubmed/28628630 http://dx.doi.org/10.1371/journal.pone.0179403 Text en © 2017 Ludeña-Choez et al http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Bird sound spectrogram decomposition through Non-Negative Matrix Factorization for the acoustic classification of bird species
title_sort bird sound spectrogram decomposition through non-negative matrix factorization for the acoustic classification of bird species
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5476267/
https://www.ncbi.nlm.nih.gov/pubmed/28628630
http://dx.doi.org/10.1371/journal.pone.0179403