
Particle Swarm Optimization Based Feature Enhancement and Feature Selection for Improved Emotion Recognition in Speech and Glottal Signals

In recent years, many studies have used speech-related features for speech emotion recognition; however, recent studies show a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear p...


Bibliographic Details
Main authors: Muthusamy, Hariharan; Polat, Kemal; Yaacob, Sazali
Format: Online Article Text
Language: English
Published: Public Library of Science 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4370637/
https://www.ncbi.nlm.nih.gov/pubmed/25799141
http://dx.doi.org/10.1371/journal.pone.0120344
author Muthusamy, Hariharan
Polat, Kemal
Yaacob, Sazali
collection PubMed
description In recent years, many studies have used speech-related features for speech emotion recognition; however, recent studies show a strong correlation between emotional states and glottal features. In this work, Mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), perceptual linear predictive (PLP) features, gammatone filter outputs, timbral texture features, stationary wavelet transform based timbral texture features, and relative wavelet packet energy and entropy features were extracted from emotional speech (ES) signals and their glottal waveforms (GW). Particle swarm optimization based clustering (PSOC) and wrapper based particle swarm optimization (WPSO) were proposed to enhance the discriminating ability of the features and to select the most discriminating features, respectively. Three different emotional speech databases were used to evaluate the proposed method. An extreme learning machine (ELM) was employed to classify the different types of emotions. Several experiments were conducted, and the results show that the proposed method significantly improves speech emotion recognition performance compared with previously published work.
format Online
Article
Text
id pubmed-4370637
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-4370637 2015-04-04 PLoS One Research Article Public Library of Science 2015-03-23 /pmc/articles/PMC4370637/ /pubmed/25799141 http://dx.doi.org/10.1371/journal.pone.0120344 Text en © 2015 Muthusamy et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Particle Swarm Optimization Based Feature Enhancement and Feature Selection for Improved Emotion Recognition in Speech and Glottal Signals
topic Research Article