
Quality of synthetic speech: perceptual dimensions, influencing factors, and instrumental assessment

This book reviews research towards perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals. They are: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and int...

Bibliographic Details
Main author: Hinterleitner, Florian
Language: eng
Published: Springer 2017
Subjects: Engineering
Online access: https://dx.doi.org/10.1007/978-981-10-3734-4
http://cds.cern.ch/record/2262190
_version_ 1780954098428805120
author Hinterleitner, Florian
author_facet Hinterleitner, Florian
author_sort Hinterleitner, Florian
collection CERN
description This book reviews research towards perceptual quality dimensions of synthetic speech, compares these findings with the state of the art, and derives a set of five universal perceptual quality dimensions for TTS signals. They are: (i) naturalness of voice, (ii) prosodic quality, (iii) fluency and intelligibility, (iv) absence of disturbances, and (v) calmness. Moreover, a test protocol for the efficient identification of those dimensions in a listening test is introduced. Furthermore, several factors influencing these dimensions are examined. In addition, different techniques for the instrumental quality assessment of TTS signals are introduced, reviewed and tested. Finally, the requirements for the integration of an instrumental quality measure into a concatenative TTS system are examined.
id cern-2262190
institution Organización Europea para la Investigación Nuclear
language eng
publishDate 2017
publisher Springer
record_format invenio
spellingShingle Engineering
Hinterleitner, Florian
Quality of synthetic speech: perceptual dimensions, influencing factors, and instrumental assessment
title Quality of synthetic speech: perceptual dimensions, influencing factors, and instrumental assessment
topic Engineering
url https://dx.doi.org/10.1007/978-981-10-3734-4
http://cds.cern.ch/record/2262190
work_keys_str_mv AT hinterleitnerflorian qualityofsyntheticspeechperceptualdimensionsinfluencingfactorsandinstrumentalassessment