Effect on speech emotion classification of a feature selection approach using a convolutional neural network
Speech emotion recognition (SER) is a challenging issue because it is not clear which features are effective for classification. Emotionally related features are always extracted from speech signals for emotional classification. Handcrafted features are mainly used for emotional identification from...
Main Authors: | Amjad, Ammar; Khan, Lal; Chang, Hsien-Tsung |
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc., 2021 |
Subjects: | Artificial Intelligence |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8576551/ https://www.ncbi.nlm.nih.gov/pubmed/34805511 http://dx.doi.org/10.7717/peerj-cs.766 |
_version_ | 1784595899556560896 |
author | Amjad, Ammar; Khan, Lal; Chang, Hsien-Tsung |
author_facet | Amjad, Ammar; Khan, Lal; Chang, Hsien-Tsung |
author_sort | Amjad, Ammar |
collection | PubMed |
description | Speech emotion recognition (SER) is challenging because it is unclear which features are effective for classification. Emotion-related features are typically extracted from speech signals, and handcrafted features are the mainstay of emotion identification from audio; however, such features are often insufficient to correctly identify a speaker's emotional state. The proposed work investigates the advantages of a deep convolutional neural network (DCNN): a pretrained network extracts features from speech emotion databases, and a feature selection (FS) approach then finds the most discriminative features for SER. Random forest (RF), decision tree (DT), support vector machine (SVM), multilayer perceptron (MLP), and k-nearest neighbors (KNN) classifiers are used to classify seven emotions across four publicly accessible databases. With feature selection, the method obtains speaker-dependent (SD) accuracies of 92.02%, 88.77%, 93.61%, and 77.23% on Emo-DB, SAVEE, RAVDESS, and IEMOCAP, respectively. Compared with current handcrafted-feature SER methods, it also achieves the best speaker-independent results. For Emo-DB, all classifiers attain more than 80% accuracy with or without feature selection. |
format | Online Article Text |
id | pubmed-8576551 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8576551 2021-11-19 Effect on speech emotion classification of a feature selection approach using a convolutional neural network. Amjad, Ammar; Khan, Lal; Chang, Hsien-Tsung. PeerJ Comput Sci, Artificial Intelligence. PeerJ Inc. 2021-11-03 /pmc/articles/PMC8576551/ /pubmed/34805511 http://dx.doi.org/10.7717/peerj-cs.766 Text en © 2021 Amjad et al.
This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Artificial Intelligence Amjad, Ammar Khan, Lal Chang, Hsien-Tsung Effect on speech emotion classification of a feature selection approach using a convolutional neural network |
title | Effect on speech emotion classification of a feature selection approach using a convolutional neural network |
title_full | Effect on speech emotion classification of a feature selection approach using a convolutional neural network |
title_fullStr | Effect on speech emotion classification of a feature selection approach using a convolutional neural network |
title_full_unstemmed | Effect on speech emotion classification of a feature selection approach using a convolutional neural network |
title_short | Effect on speech emotion classification of a feature selection approach using a convolutional neural network |
title_sort | effect on speech emotion classification of a feature selection approach using a convolutional neural network |
topic | Artificial Intelligence |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8576551/ https://www.ncbi.nlm.nih.gov/pubmed/34805511 http://dx.doi.org/10.7717/peerj-cs.766 |
work_keys_str_mv | AT amjadammar effectonspeechemotionclassificationofafeatureselectionapproachusingaconvolutionalneuralnetwork AT khanlal effectonspeechemotionclassificationofafeatureselectionapproachusingaconvolutionalneuralnetwork AT changhsientsung effectonspeechemotionclassificationofafeatureselectionapproachusingaconvolutionalneuralnetwork |
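The abstract above describes a pipeline of pretrained-network feature extraction, feature selection, and classical classifiers (RF, DT, SVM, MLP, KNN) over seven emotion classes. The paper's actual DCNN extractor and FS algorithm are not given in this record, so the sketch below is only illustrative: it substitutes synthetic "deep features" for real ones and scikit-learn's `SelectKBest` as a stand-in selection method.

```python
# Illustrative sketch of the pipeline shape described in the abstract.
# Synthetic features and SelectKBest are assumptions, not the paper's method.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)
n_samples, n_features, n_emotions = 700, 256, 7   # seven emotion classes
X = rng.normal(size=(n_samples, n_features))      # stand-in for DCNN features
y = rng.integers(0, n_emotions, size=n_samples)   # stand-in emotion labels
X[:, :20] += y[:, None] * 0.5                     # make some features informative

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0, stratify=y)

# Feature selection: keep the k features most associated with the labels
selector = SelectKBest(f_classif, k=40).fit(X_tr, y_tr)
X_tr_fs, X_te_fs = selector.transform(X_tr), selector.transform(X_te)

# The five classifier families named in the abstract
classifiers = {
    "RF": RandomForestClassifier(random_state=0),
    "DT": DecisionTreeClassifier(random_state=0),
    "SVM": SVC(),
    "MLP": MLPClassifier(max_iter=500, random_state=0),
    "KNN": KNeighborsClassifier(),
}
for name, clf in classifiers.items():
    acc = clf.fit(X_tr_fs, y_tr).score(X_te_fs, y_te)
    print(f"{name}: {acc:.3f}")
```

On real data, the selection step would be fit on training folds only, exactly as above, to keep speaker-dependent and speaker-independent evaluations honest.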