Clustering by Errors: A Self-Organized Multitask Learning Method for Acoustic Scene Classification
Acoustic scene classification (ASC) aims to infer information about the environment from audio segments. Inter-class similarity is a significant issue in ASC, as acoustic scenes with different labels may sound quite similar. In this paper, the similarity relations among scenes are correla...
Main Authors: | Zheng, Weiping; Mo, Zhenyao; Zhao, Gansen |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2021 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8747283/ https://www.ncbi.nlm.nih.gov/pubmed/35009578 http://dx.doi.org/10.3390/s22010036 |
_version_ | 1784630797037207552 |
---|---|
author | Zheng, Weiping Mo, Zhenyao Zhao, Gansen |
author_facet | Zheng, Weiping Mo, Zhenyao Zhao, Gansen |
author_sort | Zheng, Weiping |
collection | PubMed |
description | Acoustic scene classification (ASC) aims to infer information about the environment from audio segments. Inter-class similarity is a significant issue in ASC, as acoustic scenes with different labels may sound quite similar. In this paper, the similarity relations among scenes are correlated with the classification error. A method that constructs a class hierarchy from the classification error is then proposed and integrated into a multitask learning framework. Experiments show that the proposed multitask learning method improves ASC performance. On the TUT Acoustic Scene 2017 dataset, we obtain an ensemble fine-grained accuracy of 81.4%, which is better than the state of the art. With multitask learning, the basic Convolutional Neural Network (CNN) model improves by about 2.0 to 3.5 percent, depending on the spectrogram used. The coarse-category accuracies of single models (for two to six super-classes) range from 77.0% to 96.2%. On the revised version of the LITIS Rouen dataset, we achieve an ensemble fine-grained accuracy of 83.9%. The multitask learning models improve on their basic models by 1.6% to 1.8%. The coarse-category accuracies of single models range from 94.9% to 97.9% for two to six super-classes. |
format | Online Article Text |
id | pubmed-8747283 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-8747283 2022-01-11 Clustering by Errors: A Self-Organized Multitask Learning Method for Acoustic Scene Classification Zheng, Weiping Mo, Zhenyao Zhao, Gansen Sensors (Basel) Article Acoustic scene classification (ASC) aims to infer information about the environment from audio segments. Inter-class similarity is a significant issue in ASC, as acoustic scenes with different labels may sound quite similar. In this paper, the similarity relations among scenes are correlated with the classification error. A method that constructs a class hierarchy from the classification error is then proposed and integrated into a multitask learning framework. Experiments show that the proposed multitask learning method improves ASC performance. On the TUT Acoustic Scene 2017 dataset, we obtain an ensemble fine-grained accuracy of 81.4%, which is better than the state of the art. With multitask learning, the basic Convolutional Neural Network (CNN) model improves by about 2.0 to 3.5 percent, depending on the spectrogram used. The coarse-category accuracies of single models (for two to six super-classes) range from 77.0% to 96.2%. On the revised version of the LITIS Rouen dataset, we achieve an ensemble fine-grained accuracy of 83.9%. The multitask learning models improve on their basic models by 1.6% to 1.8%. The coarse-category accuracies of single models range from 94.9% to 97.9% for two to six super-classes. MDPI 2021-12-22 /pmc/articles/PMC8747283/ /pubmed/35009578 http://dx.doi.org/10.3390/s22010036 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Zheng, Weiping Mo, Zhenyao Zhao, Gansen Clustering by Errors: A Self-Organized Multitask Learning Method for Acoustic Scene Classification |
title | Clustering by Errors: A Self-Organized Multitask Learning Method for Acoustic Scene Classification |
title_full | Clustering by Errors: A Self-Organized Multitask Learning Method for Acoustic Scene Classification |
title_fullStr | Clustering by Errors: A Self-Organized Multitask Learning Method for Acoustic Scene Classification |
title_full_unstemmed | Clustering by Errors: A Self-Organized Multitask Learning Method for Acoustic Scene Classification |
title_short | Clustering by Errors: A Self-Organized Multitask Learning Method for Acoustic Scene Classification |
title_sort | clustering by errors: a self-organized multitask learning method for acoustic scene classification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8747283/ https://www.ncbi.nlm.nih.gov/pubmed/35009578 http://dx.doi.org/10.3390/s22010036 |
work_keys_str_mv | AT zhengweiping clusteringbyerrorsaselforganizedmultitasklearningmethodforacousticsceneclassification AT mozhenyao clusteringbyerrorsaselforganizedmultitasklearningmethodforacousticsceneclassification AT zhaogansen clusteringbyerrorsaselforganizedmultitasklearningmethodforacousticsceneclassification |
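
The description above mentions building a class hierarchy from classification errors. As a rough illustration of that general idea only, the sketch below groups classes into super-classes by merging the pairs that a classifier confuses most often. It is not the authors' procedure: the function name `cluster_by_errors`, the symmetric mutual-error similarity, the average-linkage merge rule, and the toy confusion matrix are all assumptions made for this example.

```python
from itertools import combinations


def cluster_by_errors(confusion, num_groups):
    """Greedily merge the classes that are most often confused.

    confusion[i][j] = number of samples of true class i predicted as j.
    Returns a sorted list of sorted class-index lists (super-classes).
    Hypothetical helper for illustration; not taken from the paper.
    """
    num_groups = max(num_groups, 1)  # at least one group must remain

    # Row-normalise counts into rates so large classes do not dominate.
    rates = []
    for row in confusion:
        total = sum(row)
        rates.append([c / total if total else 0.0 for c in row])

    def similarity(i, j):
        # Assumed similarity: mutual misclassification rate of i and j.
        return rates[i][j] + rates[j][i]

    def linkage(pair):
        # Average linkage: mean pairwise similarity between two groups.
        a, b = pair
        return sum(similarity(i, j) for i in a for j in b) / (len(a) * len(b))

    groups = [[i] for i in range(len(confusion))]
    while len(groups) > num_groups:
        # Pick the two groups with the highest average confusion and merge.
        a, b = max(combinations(groups, 2), key=linkage)
        groups.remove(a)
        groups.remove(b)
        groups.append(sorted(a + b))
    return sorted(groups)


# Toy 4-class confusion matrix (made-up numbers): classes 0/1 and 2/3
# are confused with each other far more than across the two pairs.
toy_confusion = [
    [80, 15, 3, 2],
    [12, 82, 4, 2],
    [2, 3, 75, 20],
    [1, 4, 18, 77],
]
print(cluster_by_errors(toy_confusion, 2))  # [[0, 1], [2, 3]]
```

In a multitask setting, each resulting grouping (for example, for two to six super-classes) could supply the labels for an auxiliary coarse-category task trained alongside the fine-grained one; how the paper actually wires these tasks together is not stated in this record.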