
Voice Quality Modelling for Expressive Speech Synthesis

This paper presents the perceptual experiments that were carried out in order to validate the methodology of transforming expressive speech styles using voice quality (VoQ) parameters modelling, along with the well-known prosody (F0, duration, and energy), from a neutral style into a number of ex...


Bibliographic Details
Main authors: Monzo, Carlos; Iriondo, Ignasi; Socoró, Joan Claudi
Format: Online Article Text
Language: English
Published: Hindawi Publishing Corporation 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3920859/
https://www.ncbi.nlm.nih.gov/pubmed/24587738
http://dx.doi.org/10.1155/2014/627189
_version_ 1782303237889589248
author Monzo, Carlos
Iriondo, Ignasi
Socoró, Joan Claudi
author_facet Monzo, Carlos
Iriondo, Ignasi
Socoró, Joan Claudi
author_sort Monzo, Carlos
collection PubMed
description This paper presents the perceptual experiments that were carried out in order to validate the methodology of transforming expressive speech styles using voice quality (VoQ) parameters modelling, along with the well-known prosody (F0, duration, and energy), from a neutral style into a number of expressive ones. The main goal was to validate the usefulness of VoQ in the enhancement of expressive synthetic speech in terms of speech quality and style identification. A harmonic plus noise model (HNM) was used to modify VoQ and prosodic parameters that were extracted from an expressive speech corpus. Perception test results indicated the improvement of obtained expressive speech styles using VoQ modelling along with prosodic characteristics.
format Online
Article
Text
id pubmed-3920859
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Hindawi Publishing Corporation
record_format MEDLINE/PubMed
spelling pubmed-39208592014-03-02 Voice Quality Modelling for Expressive Speech Synthesis Monzo, Carlos Iriondo, Ignasi Socoró, Joan Claudi ScientificWorldJournal Research Article This paper presents the perceptual experiments that were carried out in order to validate the methodology of transforming expressive speech styles using voice quality (VoQ) parameters modelling, along with the well-known prosody (F0, duration, and energy), from a neutral style into a number of expressive ones. The main goal was to validate the usefulness of VoQ in the enhancement of expressive synthetic speech in terms of speech quality and style identification. A harmonic plus noise model (HNM) was used to modify VoQ and prosodic parameters that were extracted from an expressive speech corpus. Perception test results indicated the improvement of obtained expressive speech styles using VoQ modelling along with prosodic characteristics. Hindawi Publishing Corporation 2014-01-22 /pmc/articles/PMC3920859/ /pubmed/24587738 http://dx.doi.org/10.1155/2014/627189 Text en Copyright © 2014 Carlos Monzo et al. https://creativecommons.org/licenses/by/3.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research Article
Monzo, Carlos
Iriondo, Ignasi
Socoró, Joan Claudi
Voice Quality Modelling for Expressive Speech Synthesis
title Voice Quality Modelling for Expressive Speech Synthesis
title_full Voice Quality Modelling for Expressive Speech Synthesis
title_fullStr Voice Quality Modelling for Expressive Speech Synthesis
title_full_unstemmed Voice Quality Modelling for Expressive Speech Synthesis
title_short Voice Quality Modelling for Expressive Speech Synthesis
title_sort voice quality modelling for expressive speech synthesis
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3920859/
https://www.ncbi.nlm.nih.gov/pubmed/24587738
http://dx.doi.org/10.1155/2014/627189
work_keys_str_mv AT monzocarlos voicequalitymodellingforexpressivespeechsynthesis
AT iriondoignasi voicequalitymodellingforexpressivespeechsynthesis
AT socorojoanclaudi voicequalitymodellingforexpressivespeechsynthesis