
Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization

The objective of speech emotion recognition (SER) is to enhance the man–machine interface. It can also be used to cover the physiological state of a person in critical situations. Recently, speech emotion recognition has also found applications in medicine and forensics. A new feature extraction tec...


Bibliographic Details
Main Authors: Bandela, Surekha Reddy, Siva Priyanka, S., Sunil Kumar, K., Vijay Bhaskar Reddy, Y., Berhanu, Afework Aemro
Format: Online Article Text
Language: English
Published: Hindawi 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10586421/
https://www.ncbi.nlm.nih.gov/pubmed/37868755
http://dx.doi.org/10.1155/2023/5765760
_version_ 1785123156175880192
author Bandela, Surekha Reddy
Siva Priyanka, S.
Sunil Kumar, K.
Vijay Bhaskar Reddy, Y.
Berhanu, Afework Aemro
author_facet Bandela, Surekha Reddy
Siva Priyanka, S.
Sunil Kumar, K.
Vijay Bhaskar Reddy, Y.
Berhanu, Afework Aemro
author_sort Bandela, Surekha Reddy
collection PubMed
description The objective of speech emotion recognition (SER) is to enhance the man–machine interface. It can also be used to cover the physiological state of a person in critical situations. Recently, speech emotion recognition has also found applications in medicine and forensics. A new feature extraction technique based on the Teager energy operator (TEO), the Teager energy-autocorrelation envelope (TEO-Auto-Env), is proposed for the detection of stressed emotions. TEO is designed to increase the energies of stressed speech signals, whose energies are reduced during the speech production process, and is therefore used in this analysis. A stressed speech emotion recognition (SSER) system is developed using a combination of TEO-Auto-Env and spectral features for detecting the emotions. The spectral features considered are Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and relative spectra–perceptual linear prediction (RASTA-PLP). The EMO-DB (German), EMOVO (Italian), IITKGP (Telugu), and EMA (English) databases are used in this analysis. The emotions are classified using the k-nearest neighbor (k-NN) classifier for gender-dependent (GD) and speaker-independent (SI) cases. The proposed SSER system provides improved accuracy compared to the existing ones. Average recall is used for performance evaluation. The highest classification accuracy is achieved using the feature combination of TEO-Auto-Env, MFCC, and LPCC features with 91.4% (SI), 91.4% (GD-male), and 93.1% (GD-female) for EMO-DB; 68.5% (SI), 68.5% (GD-male), and 74.6% (GD-female) for EMOVO; 90.6% (SI), 91% (GD-male), and 92.3% (GD-female) for EMA; and 95.1% (GD-female) for the IITKGP female database.
format Online
Article
Text
id pubmed-10586421
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Hindawi
record_format MEDLINE/PubMed
spelling pubmed-105864212023-10-20 Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization Bandela, Surekha Reddy Siva Priyanka, S. Sunil Kumar, K. Vijay Bhaskar Reddy, Y. Berhanu, Afework Aemro Comput Intell Neurosci Research Article The objective of speech emotion recognition (SER) is to enhance man–machine interface. It can also be used to cover the physiological state of a person in critical situations. In recent time, speech emotion recognition also finds its operations in medicine and forensics. A new feature extraction technique using Teager energy operator (TEO) is proposed for the detection of stressed emotions as Teager energy-autocorrelation envelope (TEO-Auto-Env). TEO is basically designed for increasing the energies of the stressed speech signals whose energies are reduced during the speech production process and hence used in this analysis. A stressed speech emotion recognition (SSER) system is developed using TEO-Auto-Env and spectral feature combination for detecting the emotions. The spectral features considered are Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and relative spectra–perceptual linear prediction (RASTA-PLP). EMO-DB (German), EMOVO (Italian), IITKGP (Telugu), and EMA (English) databases are used in this analysis. The classification of the emotions is carried out using the k-nearest neighborhood (k-NN) classifier for gender-dependent (GD) and speaker-independent (SI) cases. The proposed SSER system provides improved accuracy compared to the existing ones. Average recall is used for performance evaluation. 
The highest classification accuracy is achieved using the feature combination of TEO-Auto-Env, MFCC, and LPCC features with 91.4% (SI), 91.4% (GD-male), and 93.1% (GD-female) for EMO-DB; 68.5% (SI), 68.5% (GD-male), and 74.6% (GD-female) for EMOVO; 90.6% (SI), 91% (GD-male), and 92.3% (GD-female) for EMA; and 95.1% (GD-female) for IITKGP female database. Hindawi 2023-10-11 /pmc/articles/PMC10586421/ /pubmed/37868755 http://dx.doi.org/10.1155/2023/5765760 Text en Copyright © 2023 Surekha Reddy Bandela et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Bandela, Surekha Reddy
Siva Priyanka, S.
Sunil Kumar, K.
Vijay Bhaskar Reddy, Y.
Berhanu, Afework Aemro
Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization
title Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization
title_full Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization
title_fullStr Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization
title_full_unstemmed Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization
title_short Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization
title_sort stressed speech emotion recognition using teager energy and spectral feature fusion with feature optimization
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10586421/
https://www.ncbi.nlm.nih.gov/pubmed/37868755
http://dx.doi.org/10.1155/2023/5765760
work_keys_str_mv AT bandelasurekhareddy stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization
AT sivapriyankas stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization
AT sunilkumark stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization
AT vijaybhaskarreddyy stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization
AT berhanuafeworkaemro stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization
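
Illustrative note on the abstract: the record does not give the article's formulas, so the following is only a minimal sketch of two quantities the abstract names. It assumes the standard discrete form of the Teager energy operator, psi[n] = x[n]^2 - x[n-1] * x[n+1], and it assumes that "average recall" means the unweighted mean of per-class recall. The function names teager_energy and average_recall are hypothetical and are not taken from the article; the TEO-Auto-Env feature itself is not reproduced here.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy operator (standard form, assumed here):
    psi[n] = x[n]^2 - x[n-1] * x[n+1], defined for interior samples only."""
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def average_recall(y_true, y_pred):
    """Unweighted mean of per-class recall (assumed reading of the
    abstract's "average recall"; each class counts equally)."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


if __name__ == "__main__":
    # For a pure tone A*cos(w*n) the operator is constant: A^2 * sin(w)^2.
    n = np.arange(200)
    tone = 2.0 * np.cos(0.3 * n)
    print(teager_energy(tone)[:3], 4.0 * np.sin(0.3) ** 2)

    # Toy labels: recall is 0.5 for class 0 and 1.0 for class 1.
    print(average_recall([0, 0, 1, 1, 1], [0, 1, 1, 1, 1]))
```

For a pure tone the operator output depends on both amplitude and frequency, which is why it is often described as tracking the energy of the source that produces the signal rather than the signal's squared amplitude alone.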