Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech

Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the c...

Bibliographic Details
Main Authors: Kishida, Takuya; Nakajima, Yoshitaka; Ueda, Kazuo; Remijn, Gerard B.
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2016
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4845253/
https://www.ncbi.nlm.nih.gov/pubmed/27199790
http://dx.doi.org/10.3389/fpsyg.2016.00517
_version_ 1782428904764473344
author Kishida, Takuya
Nakajima, Yoshitaka
Ueda, Kazuo
Remijn, Gerard B.
author_facet Kishida, Takuya
Nakajima, Yoshitaka
Ueda, Kazuo
Remijn, Gerard B.
author_sort Kishida, Takuya
collection PubMed
description Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the contributions of such power-fluctuation factors to speech intelligibility. The method of factor analysis was modified to obtain factors suitable for resynthesizing speech sounds as 20-critical-band noise-vocoded speech. The resynthesized speech sounds were used for an intelligibility test. The modification of factor analysis ensured that the resynthesized speech sounds were not accompanied by a steady background noise caused by the data reduction procedure. Spoken sentences of British English, Japanese, and Mandarin Chinese were subjected to this modified analysis. Confirming the earlier analysis, indeed 3–4 factors were common to these languages. The number of power-fluctuation factors needed to make noise-vocoded speech intelligible was then examined. Critical-band power fluctuations of the Japanese spoken sentences were resynthesized from the obtained factors, resulting in noise-vocoded-speech stimuli, and the intelligibility of these speech stimuli was tested by 12 native Japanese speakers. Japanese mora (syllable-like phonological unit) identification performances were measured when the number of factors was 1–9. Statistically significant improvement in intelligibility was observed when the number of factors was increased stepwise up to 6. The 12 listeners identified 92.1% of the morae correctly on average in the 6-factor condition. The intelligibility improved sharply when the number of factors changed from 2 to 3. In this step, the cumulative contribution ratio of factors improved only by 10.6%, from 37.3 to 47.9%, but the average mora identification leaped from 6.9 to 69.2%. 
The results indicated that, if the number of factors is 3 or more, elementary linguistic information is preserved in such noise-vocoded speech.
format Online
Article
Text
id pubmed-4845253
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-48452532016-05-19 Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech Kishida, Takuya Nakajima, Yoshitaka Ueda, Kazuo Remijn, Gerard B. Front Psychol Psychology Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the contributions of such power-fluctuation factors to speech intelligibility. The method of factor analysis was modified to obtain factors suitable for resynthesizing speech sounds as 20-critical-band noise-vocoded speech. The resynthesized speech sounds were used for an intelligibility test. The modification of factor analysis ensured that the resynthesized speech sounds were not accompanied by a steady background noise caused by the data reduction procedure. Spoken sentences of British English, Japanese, and Mandarin Chinese were subjected to this modified analysis. Confirming the earlier analysis, indeed 3–4 factors were common to these languages. The number of power-fluctuation factors needed to make noise-vocoded speech intelligible was then examined. Critical-band power fluctuations of the Japanese spoken sentences were resynthesized from the obtained factors, resulting in noise-vocoded-speech stimuli, and the intelligibility of these speech stimuli was tested by 12 native Japanese speakers. Japanese mora (syllable-like phonological unit) identification performances were measured when the number of factors was 1–9. Statistically significant improvement in intelligibility was observed when the number of factors was increased stepwise up to 6. The 12 listeners identified 92.1% of the morae correctly on average in the 6-factor condition. The intelligibility improved sharply when the number of factors changed from 2 to 3. 
In this step, the cumulative contribution ratio of factors improved only by 10.6%, from 37.3 to 47.9%, but the average mora identification leaped from 6.9 to 69.2%. The results indicated that, if the number of factors is 3 or more, elementary linguistic information is preserved in such noise-vocoded speech. Frontiers Media S.A. 2016-04-26 /pmc/articles/PMC4845253/ /pubmed/27199790 http://dx.doi.org/10.3389/fpsyg.2016.00517 Text en Copyright © 2016 Kishida, Nakajima, Ueda and Remijn. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
spellingShingle Psychology
Kishida, Takuya
Nakajima, Yoshitaka
Ueda, Kazuo
Remijn, Gerard B.
Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech
title Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech
title_full Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech
title_fullStr Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech
title_full_unstemmed Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech
title_short Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech
title_sort three factors are critical in order to synthesize intelligible noise-vocoded japanese speech
topic Psychology
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4845253/
https://www.ncbi.nlm.nih.gov/pubmed/27199790
http://dx.doi.org/10.3389/fpsyg.2016.00517
work_keys_str_mv AT kishidatakuya threefactorsarecriticalinordertosynthesizeintelligiblenoisevocodedjapanesespeech
AT nakajimayoshitaka threefactorsarecriticalinordertosynthesizeintelligiblenoisevocodedjapanesespeech
AT uedakazuo threefactorsarecriticalinordertosynthesizeintelligiblenoisevocodedjapanesespeech
AT remijngerardb threefactorsarecriticalinordertosynthesizeintelligiblenoisevocodedjapanesespeech
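
Illustrative note: the description field mentions factor analysis (principal component analysis followed by varimax rotation) applied to 20 critical-band power fluctuations, and a cumulative contribution ratio for the retained factors. The Python sketch below shows one conventional way to compute these quantities. It is not the authors' modified procedure, which the record does not specify. The synthetic data, the use of a correlation matrix, and the function names `varimax` and `factor_analysis` are all assumptions made for illustration.

```python
import numpy as np


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix (varimax)."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    last = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion with respect to the rotation.
        col_ss = np.diag((rotated ** 2).sum(axis=0))
        grad = loadings.T @ (rotated ** 3 - rotated @ col_ss / n_vars)
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient
        crit = s.sum()
        if last != 0.0 and crit / last < 1.0 + tol:
            break
        last = crit
    return loadings @ rotation


def factor_analysis(power, n_factors):
    """PCA on the band correlation matrix, then varimax rotation.

    power: array of shape (time frames, bands).
    Returns (rotated loadings, cumulative contribution ratios).
    """
    corr = np.corrcoef(power, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]  # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cumulative = np.cumsum(eigvals / eigvals.sum())
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    return varimax(loadings), cumulative


# Synthetic stand-in for power fluctuations in 20 bands (assumption:
# three latent sources plus noise; not data from the study).
rng = np.random.default_rng(0)
latent = rng.standard_normal((2000, 3))
mixing = rng.standard_normal((3, 20))
power = latent @ mixing + 0.5 * rng.standard_normal((2000, 20))

rotated, cumulative = factor_analysis(power, n_factors=3)
print("loading matrix shape:", rotated.shape)
print("cumulative contribution of 3 factors:", round(float(cumulative[2]), 3))
```

Because varimax is an orthogonal rotation, it redistributes variance among the retained factors without changing their total, so the cumulative contribution ratio for a given number of factors is the same before and after rotation.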