Using SincNet for Learning Pathological Voice Disorders

Deep learning techniques such as convolutional neural networks (CNN) have been successfully applied to identify pathological voices. However, the major disadvantage of using these advanced models is the lack of interpretability in explaining the predicted outcomes. This drawback further introduces a...

Bibliographic Details
Main authors: Hung, Chao-Hsiang; Wang, Syu-Siang; Wang, Chi-Te; Fang, Shih-Hau
Format: Online Article Text
Language: English
Published: MDPI 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460101/
https://www.ncbi.nlm.nih.gov/pubmed/36081092
http://dx.doi.org/10.3390/s22176634
_version_ 1784786663964147712
author Hung, Chao-Hsiang
Wang, Syu-Siang
Wang, Chi-Te
Fang, Shih-Hau
author_facet Hung, Chao-Hsiang
Wang, Syu-Siang
Wang, Chi-Te
Fang, Shih-Hau
author_sort Hung, Chao-Hsiang
collection PubMed
description Deep learning techniques such as convolutional neural networks (CNNs) have been successfully applied to identify pathological voices. However, the major disadvantage of these advanced models is their lack of interpretability in explaining the predicted outcomes. This drawback is a bottleneck for promoting voice-disorder classification and detection systems, especially in this pandemic period. In this paper, we propose replacing the very first layer of a commonly used CNN with a series of learnable sinc functions to develop an explainable SincNet system for classifying or detecting pathological voices. The applied sinc filters, which act as a front-end signal processor in SincNet, are critical for constructing a meaningful first layer and are used directly to extract the acoustic features from which the subsequent network layers generate high-level voice information. We conducted our tests on three different Far Eastern Memorial Hospital voice datasets. In our evaluations, the proposed approach improves accuracy by up to 7% and sensitivity by up to 9% over conventional methods, demonstrating the superior performance of the SincNet system in classifying input pathological waveforms. More importantly, we offer possible explanations of the relationship between the system output and the speech features extracted by the first layer, based on our evaluation results.
format Online
Article
Text
id pubmed-9460101
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-94601012022-09-10 Using SincNet for Learning Pathological Voice Disorders Hung, Chao-Hsiang Wang, Syu-Siang Wang, Chi-Te Fang, Shih-Hau Sensors (Basel) Article Deep learning techniques such as convolutional neural networks (CNN) have been successfully applied to identify pathological voices. However, the major disadvantage of using these advanced models is the lack of interpretability in explaining the predicted outcomes. This drawback further introduces a bottleneck for promoting the classification or detection of voice-disorder systems, especially in this pandemic period. In this paper, we proposed using a series of learnable sinc functions to replace the very first layer of a commonly used CNN to develop an explainable SincNet system for classifying or detecting pathological voices. The applied sinc filters, a front-end signal processor in SincNet, are critical for constructing the meaningful layer and are directly used to extract the acoustic features for following networks to generate high-level voice information. We conducted our tests on three different Far Eastern Memorial Hospital voice datasets. From our evaluations, the proposed approach achieves the highest 7%–accuracy and 9%–sensitivity improvements from conventional methods and thus demonstrates superior performance in predicting input pathological waveforms of the SincNet system. More importantly, we intended to give possible explanations between the system output and the first-layer extracted speech features based on our evaluated results. MDPI 2022-09-02 /pmc/articles/PMC9460101/ /pubmed/36081092 http://dx.doi.org/10.3390/s22176634 Text en © 2022 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
spellingShingle Article
Hung, Chao-Hsiang
Wang, Syu-Siang
Wang, Chi-Te
Fang, Shih-Hau
Using SincNet for Learning Pathological Voice Disorders
title Using SincNet for Learning Pathological Voice Disorders
title_full Using SincNet for Learning Pathological Voice Disorders
title_fullStr Using SincNet for Learning Pathological Voice Disorders
title_full_unstemmed Using SincNet for Learning Pathological Voice Disorders
title_short Using SincNet for Learning Pathological Voice Disorders
title_sort using sincnet for learning pathological voice disorders
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460101/
https://www.ncbi.nlm.nih.gov/pubmed/36081092
http://dx.doi.org/10.3390/s22176634
work_keys_str_mv AT hungchaohsiang usingsincnetforlearningpathologicalvoicedisorders
AT wangsyusiang usingsincnetforlearningpathologicalvoicedisorders
AT wangchite usingsincnetforlearningpathologicalvoicedisorders
AT fangshihhau usingsincnetforlearningpathologicalvoicedisorders