Intuitive visualizations of pitch and loudness in speech

Visualizing acoustic features of speech has proven helpful in speech therapy; however, it is as yet unclear how to create intuitive and fitting visualizations. To better understand the mappings from speech sound aspects to visual space, a large web-based experiment (n = 249) was performed to evaluat...

Bibliographic Details
Main authors: Schaefer, Rebecca S.; Beijer, Lilian J.; Seuskens, Wiel; Rietveld, Toni C. M.; Sadakata, Makiko
Format: Online Article Text
Language: English
Published: Springer US 2015
Subjects: Brief Report
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4828474/
https://www.ncbi.nlm.nih.gov/pubmed/26370217
http://dx.doi.org/10.3758/s13423-015-0934-0
author Schaefer, Rebecca S.
Beijer, Lilian J.
Seuskens, Wiel
Rietveld, Toni C. M.
Sadakata, Makiko
collection PubMed
description Visualizing acoustic features of speech has proven helpful in speech therapy; however, it is as yet unclear how to create intuitive and fitting visualizations. To better understand the mappings from speech sound aspects to visual space, a large web-based experiment (n = 249) was performed to evaluate spatial parameters that may optimally represent pitch and loudness of speech. To this end, five novel animated visualizations were developed and presented in pairwise comparisons, together with a static visualization. Pitch and loudness of speech were each mapped onto either the vertical (y-axis) or the size (z-axis) dimension, or combined (with size indicating loudness and vertical position indicating pitch height) and visualized as an animation along the horizontal dimension (x-axis) over time. The results indicated that firstly, there is a general preference towards the use of the y-axis for both pitch and loudness, with pitch ranking higher than loudness in terms of fit. Secondly, the data suggest that representing both pitch and loudness combined in a single visualization is preferred over visualization in only one dimension. Finally, the z-axis, although not preferred, was evaluated as corresponding better to loudness than to pitch. This relation between sound and visual space has not been reported previously for speech sounds, and elaborates earlier findings on musical material. In addition to elucidating more general mappings between auditory and visual modalities, the findings provide us with a method of visualizing speech that may be helpful in clinical applications such as computerized speech therapy, or other feedback-based learning paradigms.
format Online
Article
Text
id pubmed-4828474
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Springer US
record_format MEDLINE/PubMed
spelling pubmed-4828474 2016-04-21 Intuitive visualizations of pitch and loudness in speech Schaefer, Rebecca S. Beijer, Lilian J. Seuskens, Wiel Rietveld, Toni C. M. Sadakata, Makiko Psychon Bull Rev Brief Report Springer US 2015-09-14 2016 /pmc/articles/PMC4828474/ /pubmed/26370217 http://dx.doi.org/10.3758/s13423-015-0934-0 Text en © The Author(s) 2015 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
title Intuitive visualizations of pitch and loudness in speech
topic Brief Report