
Exploring Prosodic Features Modelling for Secondary Emotions Needed for Empathetic Speech Synthesis

A low-resource emotional speech synthesis system for empathetic speech synthesis based on modelling prosody features is presented here. Secondary emotions, identified to be needed for empathetic speech, are modelled and synthesised in this investigation. As secondary emotions are subtle in nature, t...

Bibliographic Details
Main authors:	James, Jesin; B.T., Balamurali; Watson, Catherine; Mixdorff, Hansjörg
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10053518/ https://www.ncbi.nlm.nih.gov/pubmed/36991710 http://dx.doi.org/10.3390/s23062999

_version_	1785015432536653824
author	James, Jesin; B.T., Balamurali; Watson, Catherine; Mixdorff, Hansjörg
author_facet	James, Jesin; B.T., Balamurali; Watson, Catherine; Mixdorff, Hansjörg
author_sort	James, Jesin
collection	PubMed
description	A low-resource emotional speech synthesis system for empathetic speech, based on modelling prosody features, is presented here. Secondary emotions, identified as necessary for empathetic speech, are modelled and synthesised in this investigation. Because secondary emotions are subtle in nature, they are more difficult to model than primary emotions. This study is one of the few to model secondary emotions in speech, as they have not been extensively studied so far. Current speech synthesis research uses large databases and deep learning techniques to develop emotion models. There are many secondary emotions, so developing a large database for each of them is expensive. This research therefore presents a proof of concept using handcrafted feature extraction and modelling of these features with a low-resource machine learning approach, thus creating synthetic speech with secondary emotions. Here, a quantitative-model-based transformation is used to shape the emotional speech’s fundamental frequency contour. Speech rate and mean intensity are modelled via rule-based approaches. Using these models, an emotional text-to-speech synthesis system is developed to synthesise five secondary emotions: anxious, apologetic, confident, enthusiastic and worried. A perception test to evaluate the synthesised emotional speech is also conducted. The participants could identify the correct emotion in a forced response test with a hit rate greater than 65%.
format	Online Article Text
id	pubmed-10053518
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-10053518 2023-03-30 Exploring Prosodic Features Modelling for Secondary Emotions Needed for Empathetic Speech Synthesis James, Jesin; B.T., Balamurali; Watson, Catherine; Mixdorff, Hansjörg Sensors (Basel) Article MDPI 2023-03-10 /pmc/articles/PMC10053518/ /pubmed/36991710 http://dx.doi.org/10.3390/s23062999 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Exploring Prosodic Features Modelling for Secondary Emotions Needed for Empathetic Speech Synthesis
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10053518/ https://www.ncbi.nlm.nih.gov/pubmed/36991710 http://dx.doi.org/10.3390/s23062999
