
Deep Learning-Based Detection of Articulatory Features in Arabic and English Speech

This study proposes using object detection techniques to recognize sequences of articulatory features (AFs) in speech utterances by treating the AFs of phonemes as multi-label objects in a speech spectrogram. The proposed system, called AFD-Obj, recognizes sequences of multi-label AFs in a speech signal an...


Bibliographic Details
Main Authors:	Algabri, Mohammed; Mathkour, Hassan; Alsulaiman, Mansour M.; Bencherif, Mohamed A.
Format:	Online Article Text
Language:	English
Published:	MDPI 2021
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7914998/ https://www.ncbi.nlm.nih.gov/pubmed/33572169 http://dx.doi.org/10.3390/s21041205

_version_	1783657134859223040
author	Algabri, Mohammed; Mathkour, Hassan; Alsulaiman, Mansour M.; Bencherif, Mohamed A.
author_sort	Algabri, Mohammed
collection	PubMed
description	This study proposes using object detection techniques to recognize sequences of articulatory features (AFs) in speech utterances by treating the AFs of phonemes as multi-label objects in a speech spectrogram. The proposed system, called AFD-Obj, recognizes sequences of multi-label AFs in a speech signal and localizes them. AFD-Obj consists of two main stages. First, we formulate AF detection as an object detection problem and prepare the data to meet the requirements of object detectors by generating a three-channel spectral image from the speech signal and creating the corresponding annotation for each utterance. Second, we use the annotated images to train the system to detect sequences of AFs and their boundaries. We test the system by feeding it spectrogram images, in which it recognizes and localizes multi-label AFs. We also investigate using these AFs to detect the phonemes of an utterance. The YOLOv3-tiny detector is selected because of its real-time performance and its support for multi-label detection. We test AFD-Obj on Arabic and English using the KAPD and TIMIT corpora, respectively. Additionally, we propose using YOLOv3-tiny as an Arabic phoneme detection system (PD-Obj) to recognize and localize sequences of Arabic phonemes in whole speech utterances. The proposed AFD-Obj and PD-Obj systems achieve excellent results on the Arabic corpus and results comparable to the state-of-the-art method on the English corpus. Moreover, we show that using only one-scale detection is suitable for AF detection or phoneme recognition.
format	Online Article Text
id	pubmed-7914998
institution	National Center for Biotechnology Information
language	English
publishDate	2021
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-7914998 2021-03-01 Deep Learning-Based Detection of Articulatory Features in Arabic and English Speech Algabri, Mohammed; Mathkour, Hassan; Alsulaiman, Mansour M.; Bencherif, Mohamed A. Sensors (Basel) Article MDPI 2021-02-09 /pmc/articles/PMC7914998/ /pubmed/33572169 http://dx.doi.org/10.3390/s21041205 Text en © 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
title	Deep Learning-Based Detection of Articulatory Features in Arabic and English Speech
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7914998/ https://www.ncbi.nlm.nih.gov/pubmed/33572169 http://dx.doi.org/10.3390/s21041205
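
The abstract's data-preparation stage (creating an annotation for each utterance so that an object detector can localize multi-label AFs) can be illustrated with a minimal Python sketch. It assumes the widely used Darknet/YOLO text label format, `class x_center y_center width height`, with all coordinates normalized to [0, 1]. The AF inventory, the phoneme-to-AF mapping, the function name `segments_to_yolo`, and the choice to let each box span the full height of the spectrogram are hypothetical choices made for illustration; the record does not state the authors' actual label set or box geometry.

```python
from typing import Dict, List, Tuple

# Hypothetical AF inventory (illustrative only, not the paper's label set).
AF_CLASSES: List[str] = ["voiced", "unvoiced", "stop", "fricative", "nasal", "vowel"]

# Hypothetical phoneme-to-AF mapping used only for this sketch.
PHONEME_AFS: Dict[str, List[str]] = {
    "b": ["voiced", "stop"],
    "s": ["unvoiced", "fricative"],
    "m": ["voiced", "nasal"],
    "a": ["voiced", "vowel"],
}


def segments_to_yolo(
    segments: List[Tuple[str, float, float]], duration_s: float
) -> List[str]:
    """Turn (phoneme, start_s, end_s) segments into YOLO-style label lines.

    Each phoneme yields one box per articulatory feature, all sharing the
    same coordinates, which is how a multi-label object can be expressed.
    Assumption: every box covers the full frequency axis of the image.
    """
    lines: List[str] = []
    for phoneme, start_s, end_s in segments:
        if not 0.0 <= start_s < end_s <= duration_s:
            raise ValueError(f"bad segment for {phoneme!r}: {start_s}-{end_s}")
        # Time maps to the horizontal axis; normalize by utterance duration.
        x_center = (start_s + end_s) / 2.0 / duration_s
        width = (end_s - start_s) / duration_s
        for feature in PHONEME_AFS[phoneme]:
            class_id = AF_CLASSES.index(feature)
            lines.append(
                f"{class_id} {x_center:.6f} 0.500000 {width:.6f} 1.000000"
            )
    return lines


if __name__ == "__main__":
    # Two made-up segments in a 0.4 s utterance.
    for line in segments_to_yolo([("b", 0.0, 0.1), ("a", 0.1, 0.3)], 0.4):
        print(line)
```

Writing one label line per feature keeps the annotation compatible with detectors that score each class independently, which is the property the abstract cites as the reason for choosing YOLOv3-tiny.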
