
Quality prediction of synthesized speech based on tensor structured EEG signals

This study investigates quality prediction methods for synthesized speech using EEG. Training a predictive model using EEG is challenging due to a small number of training trials, a low signal-to-noise ratio, and a high correlation among independent variables. When a predictive model is trained with...


Bibliographic Details
Main authors:	Maki, Hayato; Sakti, Sakriani; Tanaka, Hiroki; Nakamura, Satoshi
Format:	Online Article Text
Language:	English
Published:	Public Library of Science 2018
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6002021/ https://www.ncbi.nlm.nih.gov/pubmed/29902169 http://dx.doi.org/10.1371/journal.pone.0193521

author	Maki, Hayato; Sakti, Sakriani; Tanaka, Hiroki; Nakamura, Satoshi
collection	PubMed
description	This study investigates quality prediction methods for synthesized speech using EEG. Training a predictive model using EEG is challenging due to a small number of training trials, a low signal-to-noise ratio, and a high correlation among independent variables. When a predictive model is trained with a machine learning algorithm, the features extracted from multi-channel EEG signals are usually organized as a vector, and their structure is ignored even though EEG signals are highly structured. This study predicts the subjective rating scores of synthesized speech, including overall impression, valence, and arousal, by creating tensor-structured features instead of vectorized ones to exploit the structure of the features. We extracted various features and constructed a tensor feature that maintained their structure. Vectorized and tensorial features were used to predict the rating scales, and the experimental results showed that prediction with tensorial features achieved better predictive performance. Among the features, those from the alpha and beta bands were more effective for prediction than the others, which agrees with previous neurophysiological studies.
format	Online Article Text
id	pubmed-6002021
institution	National Center for Biotechnology Information
language	English
publishDate	2018
publisher	Public Library of Science
record_format	MEDLINE/PubMed
spelling	pubmed-6002021 2018-06-25 Quality prediction of synthesized speech based on tensor structured EEG signals. Maki, Hayato; Sakti, Sakriani; Tanaka, Hiroki; Nakamura, Satoshi. PLoS One. Research Article. Public Library of Science 2018-06-14. /pmc/articles/PMC6002021/ /pubmed/29902169 http://dx.doi.org/10.1371/journal.pone.0193521 Text en. © 2018 Maki et al. http://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title	Quality prediction of synthesized speech based on tensor structured EEG signals
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6002021/ https://www.ncbi.nlm.nih.gov/pubmed/29902169 http://dx.doi.org/10.1371/journal.pone.0193521
