Articulation constrained learning with application to speech emotion recognition
Speech emotion recognition methods that combine articulatory information with acoustic features have previously been shown to improve recognition performance. However, collecting articulatory data on a large scale may not be feasible in many scenarios, which restricts the scope and applicability of such methods. In this paper, a discriminative learning method for emotion recognition using both articulatory and acoustic information is proposed. A traditional ℓ1-regularized logistic regression cost function is extended with additional constraints that require the model to reconstruct articulatory data, yielding sparse, interpretable representations optimized jointly for both tasks. Furthermore, the model requires articulatory features only during training; only speech features are needed for inference on out-of-sample data. Experiments evaluate emotion recognition performance over the vowels /AA/, /AE/, /IY/, and /UW/ and over complete utterances. Incorporating articulatory information is shown to significantly improve performance for valence-based classification. Results obtained for within-corpus and cross-corpus categorical emotion recognition indicate that the proposed method is more effective at distinguishing happiness from other emotions.
Main Authors: Shah, Mohit; Tu, Ming; Berisha, Visar; Chakrabarti, Chaitali; Spanias, Andreas
Format: Online Article Text
Language: English
Published: Springer International Publishing, 2019
Subjects: Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6919554/ https://www.ncbi.nlm.nih.gov/pubmed/31853252 http://dx.doi.org/10.1186/s13636-019-0157-9
_version_ | 1783480774518898688 |
author | Shah, Mohit; Tu, Ming; Berisha, Visar; Chakrabarti, Chaitali; Spanias, Andreas
author_facet | Shah, Mohit; Tu, Ming; Berisha, Visar; Chakrabarti, Chaitali; Spanias, Andreas
author_sort | Shah, Mohit |
collection | PubMed |
description | Speech emotion recognition methods that combine articulatory information with acoustic features have previously been shown to improve recognition performance. However, collecting articulatory data on a large scale may not be feasible in many scenarios, which restricts the scope and applicability of such methods. In this paper, a discriminative learning method for emotion recognition using both articulatory and acoustic information is proposed. A traditional ℓ1-regularized logistic regression cost function is extended with additional constraints that require the model to reconstruct articulatory data, yielding sparse, interpretable representations optimized jointly for both tasks. Furthermore, the model requires articulatory features only during training; only speech features are needed for inference on out-of-sample data. Experiments evaluate emotion recognition performance over the vowels /AA/, /AE/, /IY/, and /UW/ and over complete utterances. Incorporating articulatory information is shown to significantly improve performance for valence-based classification. Results obtained for within-corpus and cross-corpus categorical emotion recognition indicate that the proposed method is more effective at distinguishing happiness from other emotions.
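The abstract describes the objective only at a high level. Below is a minimal sketch of one plausible reading of it, not the authors' implementation: an ℓ1-regularized logistic regression loss augmented with a term that asks the same acoustic features to also reconstruct articulatory data. The linear reconstruction map `V`, the hyperparameters `lam` and `mu`, and the simple additive coupling between the two losses are all assumptions for illustration; the paper's exact constraint formulation may differ.

```python
# Hypothetical sketch (not the authors' code): l1-regularized logistic
# regression whose acoustic features are additionally constrained to
# reconstruct articulatory data during training.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def joint_objective(w, V, X, y, A, lam=0.1, mu=0.1):
    """Logistic loss + l1 sparsity + articulatory reconstruction penalty.

    X : (n, d) acoustic features      y : (n,) emotion labels in {0, 1}
    A : (n, k) articulatory features  w : (d,) classifier weights
    V : (d, k) assumed linear map from acoustic to articulatory space
    """
    eps = 1e-12
    p = sigmoid(X @ w)
    log_loss = -np.mean(y * np.log(p + eps) + (1.0 - y) * np.log(1.0 - p + eps))
    sparsity = lam * np.sum(np.abs(w))
    # Training-only term: articulatory data A must be recoverable from the
    # acoustic features, tying the learned representation to articulation.
    recon = mu * np.mean((X @ V - A) ** 2)
    return log_loss + sparsity + recon

# At inference time no articulatory data is needed; prediction uses the
# acoustic features and the learned classifier weights alone:
#     y_hat = sigmoid(X_test @ w) > 0.5
```

Because the reconstruction term involves only training data, this kind of objective matches the abstract's claim that articulatory features are required during training while inference on out-of-sample data uses speech features alone.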
format | Online Article Text |
id | pubmed-6919554 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2019 |
publisher | Springer International Publishing |
record_format | MEDLINE/PubMed |
spelling | pubmed-6919554 2019-12-18 Articulation constrained learning with application to speech emotion recognition Shah, Mohit; Tu, Ming; Berisha, Visar; Chakrabarti, Chaitali; Spanias, Andreas EURASIP J Audio Speech Music Process Research Speech emotion recognition methods that combine articulatory information with acoustic features have previously been shown to improve recognition performance. However, collecting articulatory data on a large scale may not be feasible in many scenarios, which restricts the scope and applicability of such methods. In this paper, a discriminative learning method for emotion recognition using both articulatory and acoustic information is proposed. A traditional ℓ1-regularized logistic regression cost function is extended with additional constraints that require the model to reconstruct articulatory data, yielding sparse, interpretable representations optimized jointly for both tasks. Furthermore, the model requires articulatory features only during training; only speech features are needed for inference on out-of-sample data. Experiments evaluate emotion recognition performance over the vowels /AA/, /AE/, /IY/, and /UW/ and over complete utterances. Incorporating articulatory information is shown to significantly improve performance for valence-based classification. Results obtained for within-corpus and cross-corpus categorical emotion recognition indicate that the proposed method is more effective at distinguishing happiness from other emotions. Springer International Publishing 2019-08-20 2019 /pmc/articles/PMC6919554/ /pubmed/31853252 http://dx.doi.org/10.1186/s13636-019-0157-9 Text en © The Author(s) 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
spellingShingle | Research; Shah, Mohit; Tu, Ming; Berisha, Visar; Chakrabarti, Chaitali; Spanias, Andreas; Articulation constrained learning with application to speech emotion recognition
title | Articulation constrained learning with application to speech emotion recognition |
title_full | Articulation constrained learning with application to speech emotion recognition |
title_fullStr | Articulation constrained learning with application to speech emotion recognition |
title_full_unstemmed | Articulation constrained learning with application to speech emotion recognition |
title_short | Articulation constrained learning with application to speech emotion recognition |
title_sort | articulation constrained learning with application to speech emotion recognition |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6919554/ https://www.ncbi.nlm.nih.gov/pubmed/31853252 http://dx.doi.org/10.1186/s13636-019-0157-9 |
work_keys_str_mv | AT shahmohit articulationconstrainedlearningwithapplicationtospeechemotionrecognition AT tuming articulationconstrainedlearningwithapplicationtospeechemotionrecognition AT berishavisar articulationconstrainedlearningwithapplicationtospeechemotionrecognition AT chakrabartichaitali articulationconstrainedlearningwithapplicationtospeechemotionrecognition AT spaniasandreas articulationconstrainedlearningwithapplicationtospeechemotionrecognition |