
Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network

Speech emotion recognition (SER) plays a significant role in human–machine interaction. Emotion recognition from speech and its precise classification is a challenging task because a machine is unable to understand its context. For an accurate emotion classification, emotionally relevant features mu...

Bibliographic Details
Main Authors:	Farooq, Misbah; Hussain, Fawad; Baloch, Naveed Khan; Raja, Fawad Riasat; Yu, Heejung; Zikria, Yousaf Bin
Format:	Online Article Text
Language:	English
Published:	MDPI 2020
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660211/ https://www.ncbi.nlm.nih.gov/pubmed/33113907 http://dx.doi.org/10.3390/s20216008

author	Farooq, Misbah Hussain, Fawad Baloch, Naveed Khan Raja, Fawad Riasat Yu, Heejung Zikria, Yousaf Bin
collection	PubMed
description	Speech emotion recognition (SER) plays a significant role in human–machine interaction. Emotion recognition from speech and its precise classification is a challenging task because a machine is unable to understand its context. For an accurate emotion classification, emotionally relevant features must be extracted from the speech data. Traditionally, handcrafted features were used for emotional classification from speech signals; however, they are not efficient enough to accurately depict the emotional states of the speaker. In this study, the benefits of a deep convolutional neural network (DCNN) for SER are explored. For this purpose, a pretrained network is used to extract features from state-of-the-art speech emotional datasets. Subsequently, a correlation-based feature selection technique is applied to the extracted features to select the most appropriate and discriminative features for SER. For the classification of emotions, we utilize support vector machines, random forests, the k-nearest neighbors algorithm, and neural network classifiers. Experiments are performed for speaker-dependent and speaker-independent SER using four publicly available datasets: the Berlin Dataset of Emotional Speech (Emo-DB), Surrey Audio Visual Expressed Emotion (SAVEE), Interactive Emotional Dyadic Motion Capture (IEMOCAP), and the Ryerson Audio Visual Dataset of Emotional Speech and Song (RAVDESS). Our proposed method achieves an accuracy of 95.10% for Emo-DB, 82.10% for SAVEE, 83.80% for IEMOCAP, and 81.30% for RAVDESS, for speaker-dependent SER experiments. Moreover, our method yields the best results for speaker-independent SER with existing handcrafted features-based SER approaches.
format	Online Article Text
id	pubmed-7660211
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7660211 2020-11-13. Sensors (Basel). MDPI, 2020-10-23. http://dx.doi.org/10.3390/s20216008. © 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	Impact of Feature Selection Algorithm on Speech Emotion Recognition Using Deep Convolutional Neural Network
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7660211/ https://www.ncbi.nlm.nih.gov/pubmed/33113907 http://dx.doi.org/10.3390/s20216008
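
The abstract describes applying a correlation-based feature selection step to features extracted by a pretrained DCNN before classification, but this record does not specify the exact algorithm. As a rough illustration only, the sketch below assumes a CFS-style merit heuristic (favor features that correlate with the class and not with each other) with greedy forward search, and uses Pearson correlation on numeric labels as a simplification. The function names (`pearson`, `cfs_merit`, `select_features`) and the toy data are hypothetical and are not taken from the article.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def cfs_merit(columns, labels, subset):
    """CFS-style merit: k * r_cf / sqrt(k + k * (k - 1) * r_ff).

    r_cf is the mean |feature-class correlation| over the subset and
    r_ff is the mean |feature-feature correlation| over all pairs in it.
    """
    k = len(subset)
    r_cf = sum(abs(pearson(columns[j], labels)) for j in subset) / k
    if k == 1:
        return r_cf
    pairs = [(a, b) for i, a in enumerate(subset) for b in subset[i + 1:]]
    r_ff = sum(abs(pearson(columns[a], columns[b])) for a, b in pairs) / len(pairs)
    return k * r_cf / math.sqrt(k + k * (k - 1) * r_ff)


def select_features(columns, labels):
    """Greedy forward search: add the feature that most improves the merit,
    and stop as soon as no candidate strictly improves it."""
    selected, best = [], 0.0
    remaining = list(range(len(columns)))
    while remaining:
        scored = [(cfs_merit(columns, labels, selected + [j]), j) for j in remaining]
        # Ties are broken in favor of the lower feature index.
        merit, j = max(scored, key=lambda t: (t[0], -t[1]))
        if merit <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = merit
    return selected


# Toy stand-in for DCNN features: each inner list is one feature across 4 utterances.
labels = [0, 0, 1, 1]          # hypothetical binary emotion labels
columns = [
    [0, 0, 1, 1],              # feature 0: tracks the label
    [0, 0, 2, 2],              # feature 1: redundant rescaling of feature 0
    [1, 0, 1, 0],              # feature 2: uncorrelated with the label
]
print(select_features(columns, labels))  # → [0]
```

In this toy case the redundant and the uninformative feature are both left out. In a full pipeline the retained columns would then be passed to one of the classifiers named in the abstract (support vector machine, random forest, k-nearest neighbors, or a neural network).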
