
Speech reconstruction using a deep partially supervised neural network

Statistical speech reconstruction for larynx-related dysphonia has achieved good performance using Gaussian mixture models and, more recently, restricted Boltzmann machine arrays; however, deep neural network (DNN)-based systems have been hampered by the limited amount of training data available from individual voice-loss patients. The authors propose a novel DNN structure that allows a partially supervised training approach on spectral features from smaller data sets, yielding very good results compared with the current state-of-the-art.
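
The record does not describe the authors' network structure in detail. Purely as an illustrative sketch of what a generic "partially supervised" recipe for spectral-feature mapping can look like (unsupervised autoencoder pre-training on plentiful unlabelled spectra, then supervised fine-tuning on a small parallel set), the following PyTorch fragment may help; the feature dimension, layer sizes, optimiser and training data are assumptions, not the paper's actual design.

# Hypothetical sketch: partially supervised training of a small spectral-mapping DNN.
# Stage 1 pre-trains an autoencoder without labels; stage 2 fine-tunes a regressor
# on a small parallel set (impaired-speech spectra -> target spectra).
import torch
import torch.nn as nn

FEAT_DIM = 40   # assumed spectral feature dimension
HIDDEN = 256    # assumed hidden layer size

encoder = nn.Sequential(nn.Linear(FEAT_DIM, HIDDEN), nn.Sigmoid())
decoder = nn.Linear(HIDDEN, FEAT_DIM)
regressor = nn.Sequential(encoder, nn.Linear(HIDDEN, FEAT_DIM))
mse = nn.MSELoss()

# Stage 1: unsupervised pre-training on unlabelled spectra (placeholder data).
unlabelled = torch.randn(1024, FEAT_DIM)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
for _ in range(10):
    opt.zero_grad()
    loss = mse(decoder(encoder(unlabelled)), unlabelled)
    loss.backward()
    opt.step()

# Stage 2: supervised fine-tuning on a small parallel set (placeholder data).
src = torch.randn(64, FEAT_DIM)
tgt = torch.randn(64, FEAT_DIM)
opt = torch.optim.Adam(regressor.parameters(), lr=1e-4)
for _ in range(10):
    opt.zero_grad()
    loss = mse(regressor(src), tgt)
    loss.backward()
    opt.step()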


Bibliographic Details
Main Authors: McLoughlin, Ian, Li, Jingjie, Song, Yan, Sharifzadeh, Hamid R.
Format: Online Article Text
Language: English
Published: The Institution of Engineering and Technology 2017
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5569940/
https://www.ncbi.nlm.nih.gov/pubmed/28868149
http://dx.doi.org/10.1049/htl.2016.0103
_version_ 1783259083943444480
author McLoughlin, Ian
Li, Jingjie
Song, Yan
Sharifzadeh, Hamid R.
author_facet McLoughlin, Ian
Li, Jingjie
Song, Yan
Sharifzadeh, Hamid R.
author_sort McLoughlin, Ian
collection PubMed
description Statistical speech reconstruction for larynx-related dysphonia has achieved good performance using Gaussian mixture models and, more recently, restricted Boltzmann machine arrays; however, deep neural network (DNN)-based systems have been hampered by the limited amount of training data available from individual voice-loss patients. The authors propose a novel DNN structure that allows a partially supervised training approach on spectral features from smaller data sets, yielding very good results compared with the current state-of-the-art.
format Online
Article
Text
id pubmed-5569940
institution National Center for Biotechnology Information
language English
publishDate 2017
publisher The Institution of Engineering and Technology
record_format MEDLINE/PubMed
spelling pubmed-5569940 2017-09-01 Speech reconstruction using a deep partially supervised neural network McLoughlin, Ian Li, Jingjie Song, Yan Sharifzadeh, Hamid R. Healthc Technol Lett Article Statistical speech reconstruction for larynx-related dysphonia has achieved good performance using Gaussian mixture models and, more recently, restricted Boltzmann machine arrays; however, deep neural network (DNN)-based systems have been hampered by the limited amount of training data available from individual voice-loss patients. The authors propose a novel DNN structure that allows a partially supervised training approach on spectral features from smaller data sets, yielding very good results compared with the current state-of-the-art. The Institution of Engineering and Technology 2017-06-09 /pmc/articles/PMC5569940/ /pubmed/28868149 http://dx.doi.org/10.1049/htl.2016.0103 Text en http://creativecommons.org/licenses/by-nc/3.0/ This is an open access article published by the IET under the Creative Commons Attribution-NonCommercial License (http://creativecommons.org/licenses/by-nc/3.0/)
spellingShingle Article
McLoughlin, Ian
Li, Jingjie
Song, Yan
Sharifzadeh, Hamid R.
Speech reconstruction using a deep partially supervised neural network
title Speech reconstruction using a deep partially supervised neural network
title_full Speech reconstruction using a deep partially supervised neural network
title_fullStr Speech reconstruction using a deep partially supervised neural network
title_full_unstemmed Speech reconstruction using a deep partially supervised neural network
title_short Speech reconstruction using a deep partially supervised neural network
title_sort speech reconstruction using a deep partially supervised neural network
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5569940/
https://www.ncbi.nlm.nih.gov/pubmed/28868149
http://dx.doi.org/10.1049/htl.2016.0103
work_keys_str_mv AT mcloughlinian speechreconstructionusingadeeppartiallysupervisedneuralnetwork
AT lijingjie speechreconstructionusingadeeppartiallysupervisedneuralnetwork
AT songyan speechreconstructionusingadeeppartiallysupervisedneuralnetwork
AT sharifzadehhamidr speechreconstructionusingadeeppartiallysupervisedneuralnetwork