Formant analysis in dysphonic patients and automatic Arabic digit speech recognition

BACKGROUND AND OBJECTIVE: There has been a growing interest in objective assessment of speech in dysphonic patients for the classification of the type and severity of voice pathologies using automatic speech recognition (ASR). The aim of this work was to study the accuracy of the conventional ASR sy...

Bibliographic Details
Main authors: Muhammad, Ghulam; Mesallam, Tamer A; Malki, Khalid H; Farahat, Mohamed; Alsulaiman, Mansour; Bukhari, Manal
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3120728/
https://www.ncbi.nlm.nih.gov/pubmed/21624137
http://dx.doi.org/10.1186/1475-925X-10-41
collection PubMed
description BACKGROUND AND OBJECTIVE: There has been a growing interest in objective assessment of speech in dysphonic patients for the classification of the type and severity of voice pathologies using automatic speech recognition (ASR). The aim of this work was to study the accuracy of the conventional ASR system (with Mel frequency cepstral coefficients (MFCCs) based front end and hidden Markov model (HMM) based back end) in recognizing the speech characteristics of people with pathological voice. MATERIALS AND METHODS: The speech samples of 62 dysphonic patients with six different types of voice disorders and 50 normal subjects were analyzed. The Arabic spoken digits were taken as an input. The distribution of the first four formants of the vowel /a/ was extracted to examine deviation of the formants from normal. RESULTS: There was 100% recognition accuracy obtained for Arabic digits spoken by normal speakers. However, there was a significant loss of accuracy in the classifications while spoken by voice disordered subjects. Moreover, no significant improvement in ASR performance was achieved after assessing a subset of the individuals with disordered voices who underwent treatment. CONCLUSION: The results of this study revealed that the current ASR technique is not a reliable tool in recognizing the speech of dysphonic patients.
id pubmed-3120728
institution National Center for Biotechnology Information
record_format MEDLINE/PubMed
Journal: Biomed Eng Online
Publication date: 2011-05-30
Copyright ©2011 Muhammad et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research