Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD

This paper proposes a robust approach to audio content classification (ACC), especially under variable noise-level conditions. Speech, music, and background noise (also called silence) are usually mixed in a noisy audio signal. Based on this observation, we propose...

Bibliographic Details
Main Author:	Wang, Kun-Ching
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516611/ https://www.ncbi.nlm.nih.gov/pubmed/33285958 http://dx.doi.org/10.3390/e22020183

_version_	1783587041143947264
author	Wang, Kun-Ching
author_facet	Wang, Kun-Ching
author_sort	Wang, Kun-Ching
collection	PubMed
description	This paper proposes a robust approach to audio content classification (ACC), especially under variable noise-level conditions. Speech, music, and background noise (also called silence) are usually mixed in a noisy audio signal. Based on this observation, we propose a hierarchical ACC approach consisting of three parts: voice activity detection (VAD), speech/music discrimination (SMD), and post-processing. First, entropy-based VAD segments the input signal into noisy audio and noise, even when the noise level varies. One-dimensional (1D) subband energy information (1D-SEI) and 2D textural image information (2D-TII) are then combined into a hybrid feature set. Hybrid-based SMD is achieved by feeding the hybrid feature set into a support vector machine (SVM) classifier. Finally, rule-based post-processing of segments smooths the output of the ACC system, so that the noisy audio is classified into noise, speech, and music. Experimental results on three available datasets show that the hierarchical ACC system using hybrid feature-based SMD and entropy-based VAD is comparable with existing methods even in a variable noise-level environment. In addition, test results with the VAD scheme and hybrid features show that the proposed architecture increases the performance of audio content discrimination.
format	Online Article Text
id	pubmed-7516611
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7516611 2020-11-09 Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD Wang, Kun-Ching Entropy (Basel) Article MDPI 2020-02-06 /pmc/articles/PMC7516611/ /pubmed/33285958 http://dx.doi.org/10.3390/e22020183 Text en © 2020 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
spellingShingle	Article Wang, Kun-Ching Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD
title	Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD
title_full	Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD
title_fullStr	Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD
title_full_unstemmed	Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD
title_short	Robust Audio Content Classification Using Hybrid-Based SMD and Entropy-Based VAD
title_sort	robust audio content classification using hybrid-based smd and entropy-based vad
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7516611/ https://www.ncbi.nlm.nih.gov/pubmed/33285958 http://dx.doi.org/10.3390/e22020183
work_keys_str_mv	AT wangkunching robustaudiocontentclassificationusinghybridbasedsmdandentropybasedvad
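
The description field above mentions an entropy-based VAD stage that separates noisy audio from noise. The record does not say which entropy measure or threshold rule the article uses, so the following is only a minimal sketch under stated assumptions: it uses the normalized Shannon entropy of a frame's power spectrum, a fixed threshold, and non-overlapping frames. The function names, the frame length of 64 samples, and the threshold of 0.6 are all hypothetical choices made for illustration, not values taken from the article.

```python
import cmath
import math


def power_spectrum(frame):
    """Naive DFT power spectrum (lower half of the bins) of a real-valued frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(frame))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Normalized Shannon entropy (0..1) of the frame's power spectrum.

    Values near 0 mean energy is concentrated in a few bins (tonal or
    harmonic content); values near 1 mean a flat, noise-like spectrum.
    """
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0 or len(power) < 2:
        return 1.0  # assumption: treat digital silence as maximally flat
    probs = [p / total for p in power]
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(len(power))


def entropy_vad(signal, frame_len=64, threshold=0.6):
    """Mark each non-overlapping frame True (active) if its entropy is below threshold.

    The fixed threshold is an illustrative assumption; a system aimed at
    variable noise levels would likely adapt it over time.
    """
    starts = range(0, len(signal) - frame_len + 1, frame_len)
    return [spectral_entropy(signal[i:i + frame_len]) < threshold for i in starts]
```

Calling `entropy_vad(samples)` on a list of float samples returns one boolean per frame. In a hierarchical pipeline like the one the abstract outlines, frames marked active would then be passed to the speech/music discrimination stage, while the remaining frames would be labeled as noise.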

Similar Items