A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion

In this paper, we propose a hybrid system based on a modified statistical GMM voice conversion algorithm for improving the recognition of esophageal speech. This hybrid system aims to compensate for the distorted information present in the esophageal acoustic features by using a voice conversion met...

Bibliographic Details
Main authors: Lachhab, Othman; Di Martino, Joseph; Elhaj, Elhassane Ibn; Hammouch, Ahmed
Format: Online Article Text
Language: English
Published: Springer International Publishing 2015
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4627987/
https://www.ncbi.nlm.nih.gov/pubmed/26543778
http://dx.doi.org/10.1186/s40064-015-1428-2
collection PubMed
description In this paper, we propose a hybrid system based on a modified statistical GMM voice conversion algorithm for improving the recognition of esophageal speech. This hybrid system aims to compensate for the distorted information present in the esophageal acoustic features by using a voice conversion method. The esophageal speech is converted into a “target” laryngeal speech using an iterative statistical estimation of a transformation function. We did not apply a speech synthesizer for reconstructing the converted speech signal, given that the converted Mel cepstral vectors are used directly as input of our speech recognition system. Furthermore the feature vectors are linearly transformed by the HLDA (heteroscedastic linear discriminant analysis) method to reduce their size in a smaller space having good discriminative properties. The experimental results demonstrate that our proposed system provides an improvement of the phone recognition accuracy with an absolute increase of 3.40 % when compared with the phone recognition accuracy obtained with neither HLDA nor voice conversion.
format Online
Article
Text
id pubmed-4627987
institution National Center for Biotechnology Information
language English
publishDate 2015
publisher Springer International Publishing
record_format MEDLINE/PubMed
spelling pubmed-4627987 2015-11-05 Springerplus Research Springer International Publishing 2015-10-26 /pmc/articles/PMC4627987/ /pubmed/26543778 http://dx.doi.org/10.1186/s40064-015-1428-2 Text en © Lachhab et al. 2015 Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
topic Research