
Shared acoustic codes underlie emotional communication in music and speech—Evidence from deep transfer learning

Music and speech exhibit striking similarities in the communication of emotions in the acoustic domain, in such a way that the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining th...


Bibliographic Details
Main authors:	Coutinho, Eduardo; Schuller, Björn
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2017
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5489171/ https://www.ncbi.nlm.nih.gov/pubmed/28658285 http://dx.doi.org/10.1371/journal.pone.0179289

author	Coutinho, Eduardo; Schuller, Björn
collection	PubMed
description	Music and speech exhibit striking similarities in the communication of emotions in the acoustic domain, in such a way that the communication of specific emotions is achieved, at least to a certain extent, by means of shared acoustic patterns. From an Affective Sciences point of view, determining the degree of overlap between both domains is fundamental to understanding the shared mechanisms underlying this phenomenon. From a Machine Learning perspective, the overlap between acoustic codes for emotional expression in music and speech opens new possibilities to enlarge the amount of data available to develop music and speech emotion recognition systems. In this article, we investigate time-continuous predictions of emotion (Arousal and Valence) in music and speech, and the Transfer Learning between these domains. We establish a comparative framework including intra-domain experiments (i.e., models trained and tested on the same modality, either music or speech) and cross-domain experiments (i.e., models trained in one modality and tested on the other). In the cross-domain context, we evaluated two strategies: the direct transfer between domains, and the contribution of Transfer Learning techniques (feature-representation-transfer based on Denoising Auto Encoders) for reducing the gap in the feature space distributions. Our results demonstrate an excellent cross-domain generalisation performance with and without feature representation transfer in both directions. In the case of music, cross-domain approaches outperformed intra-domain models for Valence estimation, whereas for speech intra-domain models achieved the best performance. This is the first demonstration of shared acoustic codes for emotional expression in music and speech in the time-continuous domain.
format	Online Article Text
id	pubmed-5489171
institution	National Center for Biotechnology Information
language	English
publishDate	2017
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-5489171 2017-07-11 PLoS One Research Article Public Library of Science 2017-06-28 /pmc/articles/PMC5489171/ /pubmed/28658285 http://dx.doi.org/10.1371/journal.pone.0179289 Text en © 2017 Coutinho, Schuller. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Shared acoustic codes underlie emotional communication in music and speech—Evidence from deep transfer learning
topic	Research Article
