Perception and classification of emotions in nonsense speech: Humans versus machines

This article contributes to a more adequate modelling of emotions encoded in speech, by addressing four fallacies prevalent in traditional affective computing: First, studies concentrate on a few emotions and disregard all others (‘closed world’). Second, studies use clean (lab) data or real-life...

Bibliographic Details
Main Authors: Parada-Cabaleiro, Emilia, Batliner, Anton, Schmitt, Maximilian, Schedl, Markus, Costantini, Giovanni, Schuller, Björn
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9886254/
https://www.ncbi.nlm.nih.gov/pubmed/36716307
http://dx.doi.org/10.1371/journal.pone.0281079
author Parada-Cabaleiro, Emilia
Batliner, Anton
Schmitt, Maximilian
Schedl, Markus
Costantini, Giovanni
Schuller, Björn
collection PubMed
description This article contributes to a more adequate modelling of emotions encoded in speech, by addressing four fallacies prevalent in traditional affective computing: First, studies concentrate on a few emotions and disregard all others (‘closed world’). Second, studies use clean (lab) data or real-life data but do not compare clean and noisy data in a comparable setting (‘clean world’). Third, machine learning approaches need large amounts of data; however, their performance has not yet been assessed by systematically comparing different approaches and different sizes of databases (‘small world’). Fourth, although human annotations of emotion constitute the basis for automatic classification, human perception and machine classification have not yet been compared on a strict basis (‘one world’). Finally, we deal with the intrinsic ambiguities of emotions by interpreting the confusions between categories (‘fuzzy world’). We use acted nonsense speech from the GEMEP corpus, emotional ‘distractors’ as categories not entailed in the test set, real-life noises that mask the clear recordings, and different sizes of the training set for machine learning. We show that machine learning based on state-of-the-art feature representations (wav2vec2) is able to mirror the main emotional categories (‘pillars’) present in perceptual emotional constellations even in degraded acoustic conditions.
format Online
Article
Text
id pubmed-9886254
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9886254 2023-01-31 Perception and classification of emotions in nonsense speech: Humans versus machines Parada-Cabaleiro, Emilia Batliner, Anton Schmitt, Maximilian Schedl, Markus Costantini, Giovanni Schuller, Björn PLoS One Research Article
Public Library of Science 2023-01-30 /pmc/articles/PMC9886254/ /pubmed/36716307 http://dx.doi.org/10.1371/journal.pone.0281079 Text en © 2023 Parada-Cabaleiro et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title Perception and classification of emotions in nonsense speech: Humans versus machines
topic Research Article