Severity-Based Adaptation with Limited Data for ASR to Aid Dysarthric Speakers

Automatic speech recognition (ASR) is currently used in many assistive technologies, for example to help individuals with speech impairment communicate. One challenge in ASR for speech-impaired individuals is the difficulty of obtaining a good speech database of impaired speakers for...

Bibliographic Details
Main authors: Mustafa, Mumtaz Begum, Salim, Siti Salwah, Mohamed, Noraini, Al-Qatab, Bassam, Siong, Chng Eng
Format: Online Article Text
Language: English
Published: Public Library of Science 2014
Subjects: Research Article
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3900508/
https://www.ncbi.nlm.nih.gov/pubmed/24466004
http://dx.doi.org/10.1371/journal.pone.0086285
author Mustafa, Mumtaz Begum
Salim, Siti Salwah
Mohamed, Noraini
Al-Qatab, Bassam
Siong, Chng Eng
collection PubMed
description Automatic speech recognition (ASR) is currently used in many assistive technologies, for example to help individuals with speech impairment communicate. One challenge in ASR for speech-impaired individuals is the difficulty of obtaining a good speech database of impaired speakers for building an effective speech acoustic model. Because existing databases of impaired speech are few and limited in size, the obvious way to build an acoustic model of impaired speech is to employ adaptation techniques. However, two issues have not been addressed in existing studies of adaptation for speech impairment: (1) identifying the most effective adaptation technique for impaired speech; and (2) choosing suitable source models to build an effective impaired-speech acoustic model. This research investigates these two issues for dysarthria, a type of speech impairment affecting millions of people. We applied both unimpaired and impaired speech as the source model with well-known adaptation techniques such as maximum likelihood linear regression (MLLR) and constrained MLLR (C-MLLR). The recognition accuracy of each impaired-speech acoustic model is measured in terms of word error rate (WER), with further assessments including phoneme insertion, substitution and deletion rates. Unimpaired speech, when combined with limited high-quality speech-impaired data, improves the performance of ASR systems in recognising severely impaired dysarthric speech. The C-MLLR adaptation technique was also found to be better than MLLR in recognising mildly and moderately impaired speech, based on statistical analysis of the WER. Phoneme substitution was found to be the biggest contributing factor to WER in dysarthric speech at all levels of severity. The results show that speech acoustic models derived from suitable adaptation techniques improve the performance of ASR systems in recognising impaired speech with limited adaptation data.
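The error measures named in the description (word error rate, plus insertion, substitution and deletion rates) can be illustrated with a minimal sketch. It assumes the conventional Levenshtein alignment between a reference and a recognised token sequence; the names `ErrorCounts` and `count_errors` are hypothetical and are not taken from the article, whose actual scoring tooling is not stated in this record. The same function works on word tokens or phoneme tokens.

```python
from dataclasses import dataclass


@dataclass
class ErrorCounts:
    substitutions: int
    deletions: int
    insertions: int
    reference_length: int

    @property
    def error_rate(self) -> float:
        # (S + D + I) / N, the usual definition of word (or phoneme) error rate
        total = self.substitutions + self.deletions + self.insertions
        return total / self.reference_length


def count_errors(reference, hypothesis):
    """Count substitutions, deletions and insertions via edit-distance alignment.

    Hypothetical helper for illustration only.
    """
    if not reference:
        raise ValueError("reference must contain at least one token")
    n, m = len(reference), len(hypothesis)
    # cell[i][j] = (total, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    cell = [[(0, 0, 0, 0)] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cell[i][0] = (i, 0, i, 0)  # every reference token deleted
    for j in range(1, m + 1):
        cell[0][j] = (j, 0, 0, j)  # every hypothesis token inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, ins = cell[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (t, s, d, ins)          # match, no cost
            else:
                diagonal = (t + 1, s + 1, d, ins)  # substitution
            t, s, d, ins = cell[i - 1][j]
            deletion = (t + 1, s, d + 1, ins)
            t, s, d, ins = cell[i][j - 1]
            insertion = (t + 1, s, d, ins + 1)
            # Tuples compare by total error count first, so the total is minimal
            cell[i][j] = min(diagonal, deletion, insertion)
    _, subs, dels, ins = cell[n][m]
    return ErrorCounts(subs, dels, ins, n)


if __name__ == "__main__":
    counts = count_errors("open the door now".split(), "open a door".split())
    print(counts, counts.error_rate)
```

In this example the reference has four words; "the" is recognised as "a" (one substitution) and "now" is missing (one deletion), so the error rate is 2/4 = 0.5. Reporting the three counts separately is what allows a statement such as the description's finding about substitution errors to be checked.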
format Online
Article
Text
id pubmed-3900508
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-3900508 2014-01-24 Severity-Based Adaptation with Limited Data for ASR to Aid Dysarthric Speakers Mustafa, Mumtaz Begum Salim, Siti Salwah Mohamed, Noraini Al-Qatab, Bassam Siong, Chng Eng PLoS One Research Article Public Library of Science 2014-01-23 /pmc/articles/PMC3900508/ /pubmed/24466004 http://dx.doi.org/10.1371/journal.pone.0086285 Text en © 2014 Mustafa et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
title Severity-Based Adaptation with Limited Data for ASR to Aid Dysarthric Speakers
topic Research Article