The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning

Machine Learning (ML) algorithms within a human–computer framework are the leading force in speech emotion recognition (SER). However, few studies explore cross-corpora aspects of SER; this work aims to explore the feasibility and characteristics of a cross-linguistic, cross-gender SER. Three ML cla...

Bibliographic Details
Main authors: Costantini, Giovanni, Parada-Cabaleiro, Emilia, Casali, Daniele, Cesarini, Valerio
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9003467/
https://www.ncbi.nlm.nih.gov/pubmed/35408076
http://dx.doi.org/10.3390/s22072461
author Costantini, Giovanni
Parada-Cabaleiro, Emilia
Casali, Daniele
Cesarini, Valerio
collection PubMed
description Machine Learning (ML) algorithms within a human–computer framework are the leading force in speech emotion recognition (SER). However, few studies explore cross-corpora aspects of SER; this work aims to explore the feasibility and characteristics of a cross-linguistic, cross-gender SER. Three ML classifiers (SVM, Naïve Bayes and MLP) are applied to acoustic features, obtained through a procedure based on Kononenko’s discretization and correlation-based feature selection. The system encompasses five emotions (disgust, fear, happiness, anger and sadness), using the Emofilm database, comprised of short clips of English movies and the respective Italian and Spanish dubbed versions, for a total of 1115 annotated utterances. The results see MLP as the most effective classifier, with accuracies higher than 90% for single-language approaches, while the cross-language classifier still yields accuracies higher than 80%. The results show cross-gender tasks to be more difficult than those involving two languages, suggesting greater differences between emotions expressed by male versus female subjects than between different languages. Four feature domains, namely, RASTA, F0, MFCC and spectral energy, are algorithmically assessed as the most effective, refining existing literature and approaches based on standard sets. To our knowledge, this is one of the first studies encompassing cross-gender and cross-linguistic assessments on SER.
format Online
Article
Text
id pubmed-9003467
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9003467 2022-04-13 Sensors (Basel) Article MDPI 2022-03-23 /pmc/articles/PMC9003467/ /pubmed/35408076 http://dx.doi.org/10.3390/s22072461 Text en
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title The Emotion Probe: On the Universality of Cross-Linguistic and Cross-Gender Speech Emotion Recognition via Machine Learning
topic Article