Optimising Speaker-Dependent Feature Extraction Parameters to Improve Automatic Speech Recognition Performance for People with Dysarthria

In Automatic Speech Recognition (ASR) systems, handling impaired speech is a major challenge because standard approaches are ineffective in the presence of dysarthria. The first aim of our work is to confirm the effectiveness of a new speech analysis technique for speakers with dysarthr...

Bibliographic Details
Main authors: Marini, Marco; Vanello, Nicola; Fanucci, Luca
Format: Online Article Text
Language: English
Published: MDPI 2021
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8512569/
https://www.ncbi.nlm.nih.gov/pubmed/34640780
http://dx.doi.org/10.3390/s21196460
author Marini, Marco
Vanello, Nicola
Fanucci, Luca
collection PubMed
description In Automatic Speech Recognition (ASR) systems, handling impaired speech is a major challenge because standard approaches are ineffective in the presence of dysarthria. The first aim of our work is to confirm the effectiveness of a new speech analysis technique for speakers with dysarthria. This new approach fine-tunes the size and shift parameters of the spectral analysis window used to compute the initial short-time Fourier transform, in order to improve the performance of a speaker-dependent ASR system. The second aim is to determine whether there is a correlation between the speaker’s voice features and the optimal window and shift parameters that minimise the error of an ASR system for that specific speaker. For our experiments, we used both impaired and unimpaired Italian speech. Specifically, we used 30 speakers with dysarthria from the IDEA database and 10 professional speakers from the CLIPS database. Both databases are freely available. The results confirm that, if a standard ASR system performs poorly for a speaker with dysarthria, it can be improved by using the new speech analysis. Conversely, the new approach is ineffective for unimpaired and mildly impaired speech. Furthermore, there is a correlation between some of the speaker’s voice features and their optimal parameters.
format Online
Article
Text
id pubmed-8512569
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-8512569 2021-10-14 Sensors (Basel) Article MDPI 2021-09-27 /pmc/articles/PMC8512569/ /pubmed/34640780 http://dx.doi.org/10.3390/s21196460 Text en © 2021 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Optimising Speaker-Dependent Feature Extraction Parameters to Improve Automatic Speech Recognition Performance for People with Dysarthria
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8512569/
https://www.ncbi.nlm.nih.gov/pubmed/34640780
http://dx.doi.org/10.3390/s21196460