Using SincNet for Learning Pathological Voice Disorders
Deep learning techniques such as convolutional neural networks (CNN) have been successfully applied to identify pathological voices. However, the major disadvantage of using these advanced models is the lack of interpretability in explaining the predicted outcomes. This drawback further introduces a...
Main Authors: | Hung, Chao-Hsiang; Wang, Syu-Siang; Wang, Chi-Te; Fang, Shih-Hau |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460101/ https://www.ncbi.nlm.nih.gov/pubmed/36081092 http://dx.doi.org/10.3390/s22176634 |
_version_ | 1784786663964147712 |
---|---|
author | Hung, Chao-Hsiang Wang, Syu-Siang Wang, Chi-Te Fang, Shih-Hau |
author_facet | Hung, Chao-Hsiang Wang, Syu-Siang Wang, Chi-Te Fang, Shih-Hau |
author_sort | Hung, Chao-Hsiang |
collection | PubMed |
description | Deep learning techniques such as convolutional neural networks (CNNs) have been successfully applied to identify pathological voices. However, the major disadvantage of these models is their lack of interpretability in explaining the predicted outcomes. This drawback is a bottleneck for promoting voice-disorder classification and detection systems, especially during the pandemic period. In this paper, we propose replacing the very first layer of a commonly used CNN with a series of learnable sinc functions to develop an explainable SincNet system for classifying or detecting pathological voices. The sinc filters, which act as a front-end signal processor in SincNet, are critical for constructing a meaningful first layer and directly extract the acoustic features from which the following network layers generate high-level voice information. We conducted our tests on three different Far Eastern Memorial Hospital voice datasets. In our evaluations, the proposed approach improves accuracy by up to 7% and sensitivity by up to 9% over conventional methods, demonstrating the superior performance of the SincNet system in predicting pathological voices from input waveforms. More importantly, we offer possible explanations of the relationship between the system output and the speech features extracted by the first layer, based on our evaluation results. |
format | Online Article Text |
id | pubmed-9460101 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2022 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-9460101 2022-09-10 Using SincNet for Learning Pathological Voice Disorders Hung, Chao-Hsiang Wang, Syu-Siang Wang, Chi-Te Fang, Shih-Hau Sensors (Basel) Article MDPI 2022-09-02 /pmc/articles/PMC9460101/ /pubmed/36081092 http://dx.doi.org/10.3390/s22176634 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Hung, Chao-Hsiang Wang, Syu-Siang Wang, Chi-Te Fang, Shih-Hau Using SincNet for Learning Pathological Voice Disorders |
title | Using SincNet for Learning Pathological Voice Disorders |
title_full | Using SincNet for Learning Pathological Voice Disorders |
title_fullStr | Using SincNet for Learning Pathological Voice Disorders |
title_full_unstemmed | Using SincNet for Learning Pathological Voice Disorders |
title_short | Using SincNet for Learning Pathological Voice Disorders |
title_sort | using sincnet for learning pathological voice disorders |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460101/ https://www.ncbi.nlm.nih.gov/pubmed/36081092 http://dx.doi.org/10.3390/s22176634 |
work_keys_str_mv | AT hungchaohsiang usingsincnetforlearningpathologicalvoicedisorders AT wangsyusiang usingsincnetforlearningpathologicalvoicedisorders AT wangchite usingsincnetforlearningpathologicalvoicedisorders AT fangshihhau usingsincnetforlearningpathologicalvoicedisorders |
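
The description field above mentions replacing the first CNN layer with learnable sinc functions. As a rough illustration only, and not the authors' implementation, the sketch below builds band-pass kernels as the difference of two windowed sinc low-pass filters and applies them to a raw waveform. The function names (`sinc_bandpass`, `sinc_layer`), the 16 kHz sample rate, the kernel size, and the Hamming window are all assumptions made for this example. In a trainable SincNet layer only the two cutoff frequencies of each filter would be learned; here they are fixed inputs.

```python
import numpy as np


def sinc_bandpass(f_low_hz, f_high_hz, kernel_size=101, sample_rate=16000):
    """Windowed band-pass FIR kernel: difference of two sinc low-pass filters.

    Hypothetical helper for illustration; the cutoffs are fixed here, whereas
    a trainable sinc layer would treat them as learnable parameters.
    """
    # Cutoffs expressed in cycles per sample.
    f1 = f_low_hz / sample_rate
    f2 = f_high_hz / sample_rate
    # Symmetric time axis centred on zero (kernel_size is assumed odd).
    n = np.arange(kernel_size) - (kernel_size - 1) / 2
    # np.sinc(x) = sin(pi*x)/(pi*x), so 2*f*np.sinc(2*f*n) is an ideal
    # low-pass impulse response with cutoff f.
    kernel = 2 * f2 * np.sinc(2 * f2 * n) - 2 * f1 * np.sinc(2 * f1 * n)
    # Taper with a Hamming window to reduce ripple from truncation.
    return kernel * np.hamming(kernel_size)


def sinc_layer(waveform, bands_hz, kernel_size=101, sample_rate=16000):
    """Filter a 1-D waveform with one kernel per (low, high) band.

    Returns an array of shape (number of bands, number of samples).
    """
    return np.stack([
        np.convolve(waveform,
                    sinc_bandpass(lo, hi, kernel_size, sample_rate),
                    mode="same")
        for lo, hi in bands_hz
    ])


if __name__ == "__main__":
    # A 1 kHz test tone should pass the 800-1200 Hz band and be
    # attenuated by the 3000-4000 Hz band.
    t = np.arange(16000) / 16000
    tone = np.sin(2 * np.pi * 1000 * t)
    features = sinc_layer(tone, [(800, 1200), (3000, 4000)])
    print(features.shape, (features ** 2).mean(axis=1))
```

Because each filter is described by just two cutoff frequencies, the pass band of every first-layer filter can be read off directly, which is the kind of interpretability the description refers to.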