Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization
The objective of speech emotion recognition (SER) is to enhance the man–machine interface. It can also be used to monitor the physiological state of a person in critical situations. Recently, speech emotion recognition has also found applications in medicine and forensics. A new feature extraction tec...
Main authors: | Bandela, Surekha Reddy; Siva Priyanka, S.; Sunil Kumar, K.; Vijay Bhaskar Reddy, Y.; Berhanu, Afework Aemro |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2023 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10586421/ https://www.ncbi.nlm.nih.gov/pubmed/37868755 http://dx.doi.org/10.1155/2023/5765760 |
_version_ | 1785123156175880192 |
---|---|
author | Bandela, Surekha Reddy; Siva Priyanka, S.; Sunil Kumar, K.; Vijay Bhaskar Reddy, Y.; Berhanu, Afework Aemro |
author_facet | Bandela, Surekha Reddy; Siva Priyanka, S.; Sunil Kumar, K.; Vijay Bhaskar Reddy, Y.; Berhanu, Afework Aemro |
author_sort | Bandela, Surekha Reddy |
collection | PubMed |
description | The objective of speech emotion recognition (SER) is to enhance the man–machine interface. It can also be used to monitor the physiological state of a person in critical situations. Recently, speech emotion recognition has also found applications in medicine and forensics. A new feature extraction technique based on the Teager energy operator (TEO), the Teager energy-autocorrelation envelope (TEO-Auto-Env), is proposed for detecting stressed emotions. TEO is designed to boost the energy of stressed speech signals, whose energy is reduced during speech production, and is therefore used in this analysis. A stressed speech emotion recognition (SSER) system is developed that combines TEO-Auto-Env with spectral features to detect emotions. The spectral features considered are Mel-frequency cepstral coefficients (MFCC), linear prediction cepstral coefficients (LPCC), and relative spectra–perceptual linear prediction (RASTA-PLP). The EMO-DB (German), EMOVO (Italian), IITKGP (Telugu), and EMA (English) databases are used in this analysis. Emotions are classified with the k-nearest neighbor (k-NN) classifier for gender-dependent (GD) and speaker-independent (SI) cases. The proposed SSER system provides improved accuracy compared to existing systems. Average recall is used for performance evaluation. The highest classification accuracy is achieved with the combination of TEO-Auto-Env, MFCC, and LPCC features: 91.4% (SI), 91.4% (GD-male), and 93.1% (GD-female) for EMO-DB; 68.5% (SI), 68.5% (GD-male), and 74.6% (GD-female) for EMOVO; 90.6% (SI), 91% (GD-male), and 92.3% (GD-female) for EMA; and 95.1% (GD-female) for the IITKGP female database. |
format | Online Article Text |
id | pubmed-10586421 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-10586421 2023-10-20 Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization Bandela, Surekha Reddy; Siva Priyanka, S.; Sunil Kumar, K.; Vijay Bhaskar Reddy, Y.; Berhanu, Afework Aemro Comput Intell Neurosci Research Article Hindawi 2023-10-11 /pmc/articles/PMC10586421/ /pubmed/37868755 http://dx.doi.org/10.1155/2023/5765760 Text en Copyright © 2023 Surekha Reddy Bandela et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Bandela, Surekha Reddy; Siva Priyanka, S.; Sunil Kumar, K.; Vijay Bhaskar Reddy, Y.; Berhanu, Afework Aemro Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization |
title | Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization |
title_full | Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization |
title_fullStr | Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization |
title_full_unstemmed | Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization |
title_short | Stressed Speech Emotion Recognition Using Teager Energy and Spectral Feature Fusion with Feature Optimization |
title_sort | stressed speech emotion recognition using teager energy and spectral feature fusion with feature optimization |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10586421/ https://www.ncbi.nlm.nih.gov/pubmed/37868755 http://dx.doi.org/10.1155/2023/5765760 |
work_keys_str_mv | AT bandelasurekhareddy stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization AT sivapriyankas stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization AT sunilkumark stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization AT vijaybhaskarreddyy stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization AT berhanuafeworkaemro stressedspeechemotionrecognitionusingteagerenergyandspectralfeaturefusionwithfeatureoptimization |
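
The description above names the Teager energy operator (TEO) but does not give its formula. As a minimal sketch, assuming the standard discrete form ψ[x(n)] = x(n)² − x(n−1)·x(n+1), the operator can be computed as shown below. This is only the basic operator, not the authors' TEO-Auto-Env feature, whose exact steps are not stated in this record. The function name `teager_energy` and the test-tone parameters (`fs`, `f0`, `amp`) are hypothetical and chosen purely for illustration.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    Returns len(x) - 2 values, because the first and last samples
    lack a neighbour on one side.
    """
    x = np.asarray(x, dtype=float)
    if x.size < 3:
        raise ValueError("need at least three samples")
    return x[1:-1] ** 2 - x[:-2] * x[2:]


# Hypothetical test tone (assumed values, not taken from the article).
fs = 8000.0   # sampling rate in Hz
f0 = 200.0    # tone frequency in Hz
amp = 0.5     # tone amplitude
n = np.arange(256)
tone = amp * np.cos(2 * np.pi * f0 / fs * n)

psi = teager_energy(tone)
print(psi[:3])
```

For a pure cosine of amplitude A and digital frequency Ω, the operator returns the constant A²·sin²(Ω), so the output tracks both amplitude and frequency. That property is why TEO is commonly used as an energy measure for speech.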