Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech
Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the c...
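
The factor analysis named in the abstract (principal component analysis followed by varimax rotation) can be illustrated with a minimal sketch. It shows the conventional procedure only, not the modified analysis the authors developed for resynthesis. The synthetic 20-band data, the use of the covariance matrix, and the function names `varimax` and `pca_varimax` are assumptions made for illustration.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a (bands x factors) loading matrix (varimax criterion)."""
    n_rows, n_cols = loadings.shape
    rotation = np.eye(n_cols)
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax criterion; the SVD gives the nearest rotation.
        grad = loadings.T @ (
            rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / n_rows
        )
        u, s, vt = np.linalg.svd(grad)
        rotation = u @ vt
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break  # criterion has stopped improving
        objective = new_objective
    return loadings @ rotation, rotation

def pca_varimax(power, n_factors):
    """power: (time_frames x bands). Returns rotated loadings and the
    cumulative contribution ratio of the retained factors."""
    centered = power - power.mean(axis=0)
    cov = np.cov(centered, rowvar=False)  # assumption: covariance, not correlation
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]  # sort by descending variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_factors] * np.sqrt(eigvals[:n_factors])
    rotated, _ = varimax(loadings)
    cumulative = eigvals[:n_factors].sum() / eigvals.sum()
    return rotated, cumulative

# Hypothetical data: 500 time frames of 20 band powers driven by 3 latent factors.
rng = np.random.default_rng(0)
latent = rng.standard_normal((500, 3))
mixing = rng.standard_normal((3, 20))
power = latent @ mixing + 0.1 * rng.standard_normal((500, 20))

rotated, cumulative = pca_varimax(power, n_factors=3)
print(rotated.shape, round(float(cumulative), 3))
```

The cumulative contribution ratio returned here is the share of total variance carried by the retained factors, which is the quantity the abstract reports as rising from 37.3% to 47.9% between the 2-factor and 3-factor conditions.
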
Main Authors: | Kishida, Takuya; Nakajima, Yoshitaka; Ueda, Kazuo; Remijn, Gerard B. |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2016 |
Subjects: | Psychology |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4845253/ https://www.ncbi.nlm.nih.gov/pubmed/27199790 http://dx.doi.org/10.3389/fpsyg.2016.00517 |
_version_ | 1782428904764473344 |
---|---|
author | Kishida, Takuya Nakajima, Yoshitaka Ueda, Kazuo Remijn, Gerard B. |
author_facet | Kishida, Takuya Nakajima, Yoshitaka Ueda, Kazuo Remijn, Gerard B. |
author_sort | Kishida, Takuya |
collection | PubMed |
description | Factor analysis (principal component analysis followed by varimax rotation) had shown that 3 common factors appear across 20 critical-band power fluctuations derived from spoken sentences of eight different languages [Ueda et al. (2010). Fechner Day 2010, Padua]. The present study investigated the contributions of such power-fluctuation factors to speech intelligibility. The method of factor analysis was modified to obtain factors suitable for resynthesizing speech sounds as 20-critical-band noise-vocoded speech. The resynthesized speech sounds were used for an intelligibility test. The modification of factor analysis ensured that the resynthesized speech sounds were not accompanied by a steady background noise caused by the data reduction procedure. Spoken sentences of British English, Japanese, and Mandarin Chinese were subjected to this modified analysis. Confirming the earlier analysis, indeed 3–4 factors were common to these languages. The number of power-fluctuation factors needed to make noise-vocoded speech intelligible was then examined. Critical-band power fluctuations of the Japanese spoken sentences were resynthesized from the obtained factors, resulting in noise-vocoded-speech stimuli, and the intelligibility of these speech stimuli was tested by 12 native Japanese speakers. Japanese mora (syllable-like phonological unit) identification performances were measured when the number of factors was 1–9. Statistically significant improvement in intelligibility was observed when the number of factors was increased stepwise up to 6. The 12 listeners identified 92.1% of the morae correctly on average in the 6-factor condition. The intelligibility improved sharply when the number of factors changed from 2 to 3. In this step, the cumulative contribution ratio of factors improved only by 10.6%, from 37.3 to 47.9%, but the average mora identification leaped from 6.9 to 69.2%. The results indicated that, if the number of factors is 3 or more, elementary linguistic information is preserved in such noise-vocoded speech. |
format | Online Article Text |
id | pubmed-4845253 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2016 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-4845253 2016-05-19 Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech Kishida, Takuya Nakajima, Yoshitaka Ueda, Kazuo Remijn, Gerard B. Front Psychol Psychology Frontiers Media S.A. 2016-04-26 /pmc/articles/PMC4845253/ /pubmed/27199790 http://dx.doi.org/10.3389/fpsyg.2016.00517 Text en Copyright © 2016 Kishida, Nakajima, Ueda and Remijn. http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Psychology Kishida, Takuya Nakajima, Yoshitaka Ueda, Kazuo Remijn, Gerard B. Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech |
title | Three Factors Are Critical in Order to Synthesize Intelligible Noise-Vocoded Japanese Speech |
topic | Psychology |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4845253/ https://www.ncbi.nlm.nih.gov/pubmed/27199790 http://dx.doi.org/10.3389/fpsyg.2016.00517 |