Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children

Children may benefit from automatic speaker identification in a variety of applications, including child security, safety, and education. The key focus of this study is to develop a closed-set child speaker identification system for non-native speakers of English in both text-dependent and text-inde...

Bibliographic Details
Main Authors: Radha, Kodali; Bansal, Mohan
Format: Online Article Text
Language: English
Published: Springer Nature Singapore 2023
Subjects: Original Research
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10023307/
https://www.ncbi.nlm.nih.gov/pubmed/37056796
http://dx.doi.org/10.1007/s41870-023-01224-8
_version_ 1784908895865536512
author Radha, Kodali
Bansal, Mohan
author_facet Radha, Kodali
Bansal, Mohan
author_sort Radha, Kodali
collection PubMed
description Children may benefit from automatic speaker identification in a variety of applications, including child security, safety, and education. This study develops a closed-set child speaker identification system for non-native speakers of English in both text-dependent and text-independent speech tasks, in order to track how the speaker’s fluency affects the system. The multi-scale wavelet scattering transform is used to compensate for problems such as the loss of high-frequency information caused by the most widely used feature extractor, mel frequency cepstral coefficients. The proposed large-scale speaker identification system performs well using a wavelet-scattering Bi-LSTM. The model identifies non-native children across multiple classes; its performance in text-independent and text-dependent tasks is assessed using average accuracy, precision, recall, and F-measure, and it outperforms existing models.
format Online
Article
Text
id pubmed-10023307
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Springer Nature Singapore
record_format MEDLINE/PubMed
spelling pubmed-10023307 2023-03-21
Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children
Radha, Kodali
Bansal, Mohan
Int J Inf Technol
Original Research
Children may benefit from automatic speaker identification in a variety of applications, including child security, safety, and education. The key focus of this study is to develop a closed-set child speaker identification system for non-native speakers of English in both text-dependent and text-independent speech tasks in order to track how the speaker’s fluency affects the system. The multi-scale wavelet scattering transform is used to compensate for concerns like the loss of high-frequency information caused by the most widely used mel frequency cepstral coefficients feature extractor. The proposed large-scale speaker identification system succeeds well by employing wavelet scattered Bi-LSTM. While this procedure is used to identify non-native children in multiple classes, average values of accuracy, precision, recall, and F-measure are being used to assess the performance of the model in text-independent and text-dependent tasks, which outperforms the existing models.
Springer Nature Singapore 2023-03-18 2023
/pmc/articles/PMC10023307/ /pubmed/37056796 http://dx.doi.org/10.1007/s41870-023-01224-8
Text en
© The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10023307/
https://www.ncbi.nlm.nih.gov/pubmed/37056796
http://dx.doi.org/10.1007/s41870-023-01224-8