Infant Cry Signal Diagnostic System Using Deep Learning and Fused Features

Early diagnosis of medical conditions in infants is crucial for ensuring timely and effective treatment. However, infants are unable to verbalize their symptoms, making it difficult for healthcare professionals to accurately diagnose their conditions. Crying is often the only way for infants to comm...


Bibliographic Details
Main authors: Zayed, Yara; Hasasneh, Ahmad; Tadj, Chakib
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10297367/
https://www.ncbi.nlm.nih.gov/pubmed/37371002
http://dx.doi.org/10.3390/diagnostics13122107
author Zayed, Yara
Hasasneh, Ahmad
Tadj, Chakib
collection PubMed
description Early diagnosis of medical conditions in infants is crucial for ensuring timely and effective treatment. However, infants cannot verbalize their symptoms, which makes it difficult for healthcare professionals to diagnose their conditions accurately. Crying is often the only way for infants to communicate their needs and discomfort. In this paper, we propose a medical diagnostic system for interpreting infants’ cry audio signals (CAS) using a combination of different audio domain features and deep learning (DL) algorithms. The proposed system utilizes a dataset of labeled audio signals from infants with specific pathologies. The dataset covers two infant pathologies with high mortality rates, neonatal respiratory distress syndrome (RDS) and sepsis, as well as cries from healthy infants. The system employs the harmonic ratio (HR) as a prosodic feature, the Gammatone frequency cepstral coefficients (GFCCs) as a cepstral feature, and image-based features extracted from the spectrogram using a pretrained convolutional neural network (CNN). These image-based features are fused with the other features so that the model benefits from multiple domains, improving its classification rate and accuracy. The different combinations of the fused features are then fed into multiple machine learning algorithms, including random forest (RF), support vector machine (SVM), and deep neural network (DNN) models. The evaluation of the system using accuracy, precision, recall, F1-score, the confusion matrix, and the receiver operating characteristic (ROC) curve showed promising results for the early diagnosis of medical conditions in infants based on crying signals alone: the system achieved its highest accuracy of 97.50% using the combination of the spectrogram, HR, and GFCC through the deep learning process.
The findings demonstrate the importance of fusing different audio features, especially the spectrogram, through the learning process rather than by simple concatenation, and of using deep learning algorithms to extract sparsely represented features that can later be used in the classification problem, which improves the separation between different infant pathologies. The results outperformed the published benchmark paper by extending the classification problem to a multiclass setting (RDS, sepsis, and healthy), investigating a new type of feature, the spectrogram, and using a new feature fusion technique, namely fusion through the learning process of the deep learning model.
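The contrast the description draws between simple concatenation and fusion through the learning process can be illustrated with a minimal NumPy sketch. This is not the authors' architecture: the feature sizes, the hidden width, the random stand-in features, and the helper names (`early_fusion`, `learned_fusion`, `dense`, `init`) are all assumptions made for illustration, and the weights are untrained.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-recording feature sizes; the record does not state them.
N, D_SPEC, D_GFCC, D_HR = 8, 128, 13, 1
spec = rng.normal(size=(N, D_SPEC))  # stand-in for CNN spectrogram embeddings
gfcc = rng.normal(size=(N, D_GFCC))  # stand-in for GFCC vectors
hr = rng.normal(size=(N, D_HR))      # stand-in for harmonic ratio values


def early_fusion(*blocks):
    """Simple concatenation: place the raw feature blocks side by side."""
    return np.concatenate(blocks, axis=1)


def dense(x, w, b):
    """One fully connected layer with ReLU activation."""
    return np.maximum(x @ w + b, 0.0)


def init(d_in, d_out):
    """Small random weights and zero biases for an untrained layer."""
    return rng.normal(scale=0.1, size=(d_in, d_out)), np.zeros(d_out)


def learned_fusion(blocks, branch_params, head_params):
    """Pass each block through its own layer, join the hidden outputs,
    and map them to class probabilities with a softmax head."""
    hidden = [dense(x, w, b) for x, (w, b) in zip(blocks, branch_params)]
    joint = np.concatenate(hidden, axis=1)
    w, b = head_params
    logits = joint @ w + b
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    exp = np.exp(logits)
    return exp / exp.sum(axis=1, keepdims=True)


H = 16  # hypothetical hidden width per branch
branches = [init(D_SPEC, H), init(D_GFCC, H), init(D_HR, H)]
head = init(3 * H, 3)  # three classes: RDS, sepsis, healthy

concat = early_fusion(spec, gfcc, hr)
probs = learned_fusion([spec, gfcc, hr], branches, head)
print(concat.shape, probs.shape)
```

In the concatenated vector the 128 spectrogram dimensions sit next to a single HR value, whereas the branch layers give every feature type an equally sized hidden representation before joining. In a trained network those branch weights would be learned jointly with the classifier, which is one plausible reading of "fusion through the learning process".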
format Online
Article
Text
id pubmed-10297367
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10297367 2023-06-28 Infant Cry Signal Diagnostic System Using Deep Learning and Fused Features Zayed, Yara Hasasneh, Ahmad Tadj, Chakib Diagnostics (Basel) Article MDPI 2023-06-19 /pmc/articles/PMC10297367/ /pubmed/37371002 http://dx.doi.org/10.3390/diagnostics13122107 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title Infant Cry Signal Diagnostic System Using Deep Learning and Fused Features
topic Article