Using SincNet for Learning Pathological Voice Disorders

Deep learning techniques such as convolutional neural networks (CNN) have been successfully applied to identify pathological voices. However, the major disadvantage of using these advanced models is the lack of interpretability in explaining the predicted outcomes. This drawback further introduces a...


Bibliographic Details
Main Authors:	Hung, Chao-Hsiang; Wang, Syu-Siang; Wang, Chi-Te; Fang, Shih-Hau
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460101/ https://www.ncbi.nlm.nih.gov/pubmed/36081092 http://dx.doi.org/10.3390/s22176634

author	Hung, Chao-Hsiang; Wang, Syu-Siang; Wang, Chi-Te; Fang, Shih-Hau
collection	PubMed
description	Deep learning techniques such as convolutional neural networks (CNNs) have been successfully applied to identify pathological voices. However, the major disadvantage of these advanced models is their lack of interpretability in explaining the predicted outcomes. This drawback is a bottleneck for promoting voice-disorder classification and detection systems, especially during the pandemic period. In this paper, we propose replacing the very first layer of a commonly used CNN with a series of learnable sinc functions to develop an explainable SincNet system for classifying or detecting pathological voices. The sinc filters, which act as a front-end signal processor in SincNet, are critical for constructing a meaningful first layer and directly extract the acoustic features that the following networks use to generate high-level voice information. We conducted our tests on three different Far Eastern Memorial Hospital voice datasets. In our evaluations, the proposed approach achieves improvements of up to 7% in accuracy and 9% in sensitivity over conventional methods, demonstrating the superior performance of the SincNet system in classifying input pathological waveforms. More importantly, based on our evaluation results, we offer possible explanations of the relationship between the system output and the speech features extracted by the first layer.
format	Online Article Text
id	pubmed-9460101
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9460101 2022-09-10 Using SincNet for Learning Pathological Voice Disorders Hung, Chao-Hsiang; Wang, Syu-Siang; Wang, Chi-Te; Fang, Shih-Hau Sensors (Basel) Article MDPI 2022-09-02 /pmc/articles/PMC9460101/ /pubmed/36081092 http://dx.doi.org/10.3390/s22176634 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Using SincNet for Learning Pathological Voice Disorders
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460101/ https://www.ncbi.nlm.nih.gov/pubmed/36081092 http://dx.doi.org/10.3390/s22176634
