An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks
Diseases of internal organs other than the vocal folds can also affect a person's voice. As a result, voice problems are on the rise, even though they are frequently overlooked. According to a recent study, voice pathology detection systems can successfully help the assessment of voice abnormal...
Main authors: | Zakariah, Mohammed; B, Reshma; Ajmi Alotaibi, Yousef; Guo, Yanhui; Tran-Trung, Kiet; Elahi, Mohammad Mamun |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Hindawi 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9071878/ https://www.ncbi.nlm.nih.gov/pubmed/35529259 http://dx.doi.org/10.1155/2022/7814952 |
_version_ | 1784700926605393920 |
---|---|
author | Zakariah, Mohammed; B, Reshma; Ajmi Alotaibi, Yousef; Guo, Yanhui; Tran-Trung, Kiet; Elahi, Mohammad Mamun |
author_facet | Zakariah, Mohammed; B, Reshma; Ajmi Alotaibi, Yousef; Guo, Yanhui; Tran-Trung, Kiet; Elahi, Mohammad Mamun |
author_sort | Zakariah, Mohammed |
collection | PubMed |
description | Diseases of internal organs other than the vocal folds can also affect a person's voice. As a result, voice problems are on the rise, even though they are frequently overlooked. According to a recent study, voice pathology detection systems can support the assessment of voice abnormalities and enable the early diagnosis of voice pathology. In particular, automatic systems for distinguishing healthy from diseased voices have received much attention for the early identification and diagnosis of voice problems. Artificial intelligence-assisted voice analysis therefore opens up new possibilities in healthcare. This work aimed to assess the utility of several automatic speech signal analysis methods for diagnosing voice disorders and to suggest a strategy for classifying healthy and diseased voices. The proposed framework combines three voice features: chroma, mel spectrogram, and mel frequency cepstral coefficients (MFCC). We also designed a deep neural network (DNN) capable of learning from the extracted features and producing a highly accurate voice-based disease prediction model. The study describes a series of experiments using the Saarbruecken Voice Database (SVD) to detect abnormal voices. The model was developed and tested using the vowels /a/, /i/, and /u/ pronounced at high, low, and average pitch. We also retained the “continuous sentence” audio files collected from SVD to assess how well the developed model generalizes to completely new data. The highest accuracy achieved was 77.49%, superior to prior attempts in the same domain. Additionally, the model attains an accuracy of 88.01% when speaker gender information is integrated. The model trained on selected diseases can also reach a maximum accuracy of 96.77% (cordectomy × healthy). These results suggest that the framework is well suited to healthcare applications. |
format | Online Article Text |
id | pubmed-9071878 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | Hindawi |
record_format | MEDLINE/PubMed |
spelling | pubmed-9071878 2022-05-06 An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks Zakariah, Mohammed; B, Reshma; Ajmi Alotaibi, Yousef; Guo, Yanhui; Tran-Trung, Kiet; Elahi, Mohammad Mamun Comput Math Methods Med Research Article Hindawi 2022-04-04 /pmc/articles/PMC9071878/ /pubmed/35529259 http://dx.doi.org/10.1155/2022/7814952 Text en Copyright © 2022 Mohammed Zakariah et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Zakariah, Mohammed; B, Reshma; Ajmi Alotaibi, Yousef; Guo, Yanhui; Tran-Trung, Kiet; Elahi, Mohammad Mamun An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks |
title | An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks |
title_full | An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks |
title_fullStr | An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks |
title_full_unstemmed | An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks |
title_short | An Analytical Study of Speech Pathology Detection Based on MFCC and Deep Neural Networks |
title_sort | analytical study of speech pathology detection based on mfcc and deep neural networks |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9071878/ https://www.ncbi.nlm.nih.gov/pubmed/35529259 http://dx.doi.org/10.1155/2022/7814952 |
work_keys_str_mv | AT zakariahmohammed ananalyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT breshma ananalyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT ajmialotaibiyousef ananalyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT guoyanhui ananalyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT trantrungkiet ananalyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT elahimohammadmamun ananalyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT zakariahmohammed analyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT breshma analyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT ajmialotaibiyousef analyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT guoyanhui analyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT trantrungkiet analyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks AT elahimohammadmamun analyticalstudyofspeechpathologydetectionbasedonmfccanddeepneuralnetworks |
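
The description above names MFCC as one of the three voice features. As a rough illustration only, the sketch below computes MFCCs for a single audio frame using the Python standard library. It is not the authors' implementation: the function names, the HTK-style mel formula, the Hamming window, the 8 kHz sample rate, the 20 filters, the 13 coefficients, and the synthetic 220 Hz tone are all assumptions chosen for the example.

```python
import math

def hz_to_mel(f_hz):
    # HTK-style mel scale; one common convention, assumed here.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    # Naive O(n^2) DFT of a Hamming-windowed frame, bins 0..n/2.
    n = len(frame)
    win = [x * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
           for i, x in enumerate(frame)]
    spec = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2.0 * math.pi * k * i / n) for i, x in enumerate(win))
        im = -sum(x * math.sin(2.0 * math.pi * k * i / n) for i, x in enumerate(win))
        spec.append((re * re + im * im) / n)
    return spec

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_fft // 2 + 1)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank

def mfcc_frame(frame, sample_rate, n_filters=20, n_coeffs=13):
    # Power spectrum -> mel filterbank energies -> log -> DCT-II.
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    log_mel = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
               for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / n_filters)
                for j, e in enumerate(log_mel))
            for c in range(n_coeffs)]

# Synthetic stand-in for one frame of a sustained vowel (hypothetical input).
sample_rate = 8000
frame = [math.sin(2.0 * math.pi * 220.0 * i / sample_rate) for i in range(256)]
coeffs = mfcc_frame(frame, sample_rate)
print(len(coeffs))
```

In practice a recording would be split into overlapping frames and the per-frame coefficients aggregated before being passed to a classifier; how the paper does this is not stated in this record.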