Comparing supervised and unsupervised approaches to multimodal emotion recognition
Main Authors: | Fernández Carbonell, Marcos; Boman, Magnus; Laukka, Petri |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | PeerJ Inc., 2021 |
Subjects: | Computer Vision |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8725659/ https://www.ncbi.nlm.nih.gov/pubmed/35036530 http://dx.doi.org/10.7717/peerj-cs.804 |
_version_ | 1784626161570021376 |
---|---|
author | Fernández Carbonell, Marcos Boman, Magnus Laukka, Petri |
author_facet | Fernández Carbonell, Marcos Boman, Magnus Laukka, Petri |
author_sort | Fernández Carbonell, Marcos |
collection | PubMed |
description | We investigated emotion classification from brief video recordings from the GEMEP database wherein actors portrayed 18 emotions. Vocal features consisted of acoustic parameters related to frequency, intensity, spectral distribution, and durations. Facial features consisted of facial action units. We first performed a series of person-independent supervised classification experiments. Best performance (AUC = 0.88) was obtained by merging the output from the best unimodal vocal (Elastic Net, AUC = 0.82) and facial (Random Forest, AUC = 0.80) classifiers using a late fusion approach and the product rule method (a sketch of this fusion step appears after the record below). All 18 emotions were recognized with above-chance recall, although recognition rates varied widely across emotions (e.g., high for amusement, anger, and disgust; and low for shame). Multimodal feature patterns for each emotion are described in terms of the vocal and facial features that contributed most to classifier performance. Next, a series of exploratory unsupervised classification experiments were performed to gain more insight into how emotion expressions are organized. Solutions from traditional clustering techniques were interpreted using decision trees in order to explore which features underlie clustering. Another approach utilized various dimensionality reduction techniques paired with inspection of data visualizations. Unsupervised methods did not cluster stimuli in terms of emotion categories, but several explanatory patterns were observed. Some could be interpreted in terms of valence and arousal, but actor- and gender-specific aspects also contributed to clustering. Identifying explanatory patterns holds great potential as a meta-heuristic when unsupervised methods are used in complex classification tasks. |
format | Online Article Text |
id | pubmed-8725659 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | PeerJ Inc. |
record_format | MEDLINE/PubMed |
spelling | pubmed-8725659 2022-01-14 Comparing supervised and unsupervised approaches to multimodal emotion recognition Fernández Carbonell, Marcos Boman, Magnus Laukka, Petri PeerJ Comput Sci Computer Vision We investigated emotion classification from brief video recordings from the GEMEP database wherein actors portrayed 18 emotions. Vocal features consisted of acoustic parameters related to frequency, intensity, spectral distribution, and durations. Facial features consisted of facial action units. We first performed a series of person-independent supervised classification experiments. Best performance (AUC = 0.88) was obtained by merging the output from the best unimodal vocal (Elastic Net, AUC = 0.82) and facial (Random Forest, AUC = 0.80) classifiers using a late fusion approach and the product rule method. All 18 emotions were recognized with above-chance recall, although recognition rates varied widely across emotions (e.g., high for amusement, anger, and disgust; and low for shame). Multimodal feature patterns for each emotion are described in terms of the vocal and facial features that contributed most to classifier performance. Next, a series of exploratory unsupervised classification experiments were performed to gain more insight into how emotion expressions are organized. Solutions from traditional clustering techniques were interpreted using decision trees in order to explore which features underlie clustering. Another approach utilized various dimensionality reduction techniques paired with inspection of data visualizations. Unsupervised methods did not cluster stimuli in terms of emotion categories, but several explanatory patterns were observed. Some could be interpreted in terms of valence and arousal, but actor- and gender-specific aspects also contributed to clustering. Identifying explanatory patterns holds great potential as a meta-heuristic when unsupervised methods are used in complex classification tasks. PeerJ Inc. 2021-12-24 /pmc/articles/PMC8725659/ /pubmed/35036530 http://dx.doi.org/10.7717/peerj-cs.804 Text en © 2021 Fernández Carbonell et al. https://creativecommons.org/licenses/by-nc/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by-nc/4.0/), which permits using, remixing, and building upon the work non-commercially, as long as it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited. |
spellingShingle | Computer Vision Fernández Carbonell, Marcos Boman, Magnus Laukka, Petri Comparing supervised and unsupervised approaches to multimodal emotion recognition |
title | Comparing supervised and unsupervised approaches to multimodal emotion recognition |
title_full | Comparing supervised and unsupervised approaches to multimodal emotion recognition |
title_fullStr | Comparing supervised and unsupervised approaches to multimodal emotion recognition |
title_full_unstemmed | Comparing supervised and unsupervised approaches to multimodal emotion recognition |
title_short | Comparing supervised and unsupervised approaches to multimodal emotion recognition |
title_sort | comparing supervised and unsupervised approaches to multimodal emotion recognition |
topic | Computer Vision |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8725659/ https://www.ncbi.nlm.nih.gov/pubmed/35036530 http://dx.doi.org/10.7717/peerj-cs.804 |
work_keys_str_mv | AT fernandezcarbonellmarcos comparingsupervisedandunsupervisedapproachestomultimodalemotionrecognition AT bomanmagnus comparingsupervisedandunsupervisedapproachestomultimodalemotionrecognition AT laukkapetri comparingsupervisedandunsupervisedapproachestomultimodalemotionrecognition |
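The product-rule late fusion described in the record above can be illustrated with a minimal sketch. This is a hedged illustration assuming scikit-learn, not the authors' actual pipeline: the synthetic data, feature dimensionalities, hyperparameters, and all variable names are hypothetical, and the abstract's "Elastic Net" classifier is realized here as logistic regression with an elastic-net penalty (one common scikit-learn reading).

```python
# Hypothetical sketch of product-rule late fusion over unimodal
# classifiers; data and parameters are invented for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_clips, n_emotions = 200, 18
X_vocal = rng.normal(size=(n_clips, 88))   # stand-in acoustic features
X_facial = rng.normal(size=(n_clips, 17))  # stand-in facial action units
y = rng.integers(0, n_emotions, size=n_clips)

# Vocal model: logistic regression with an elastic-net penalty
# (an assumed realization of the abstract's "Elastic Net").
vocal_clf = LogisticRegression(penalty="elasticnet", solver="saga",
                               l1_ratio=0.5, max_iter=5000).fit(X_vocal, y)
# Facial model: Random Forest, as named in the abstract.
facial_clf = RandomForestClassifier(n_estimators=300,
                                    random_state=0).fit(X_facial, y)

# Product rule: multiply the unimodal class-probability estimates
# element-wise, renormalize per clip, and take the argmax emotion.
proba = vocal_clf.predict_proba(X_vocal) * facial_clf.predict_proba(X_facial)
proba /= proba.sum(axis=1, keepdims=True) + 1e-12  # guard all-zero rows
fused_pred = proba.argmax(axis=1)
```

The product rule treats the two modalities' probability estimates as roughly independent evidence; the sum or mean rule is a common alternative that is more tolerant of one modality assigning near-zero probability to the correct class.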