Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children
Children may benefit from automatic speaker identification in a variety of applications, including child security, safety, and education. The key focus of this study is to develop a closed-set child speaker identification system for non-native speakers of English in both text-dependent and text-inde...
Main authors: | Radha, Kodali; Bansal, Mohan |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Springer Nature Singapore 2023 |
Subjects: | Original Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10023307/ https://www.ncbi.nlm.nih.gov/pubmed/37056796 http://dx.doi.org/10.1007/s41870-023-01224-8 |
_version_ | 1784908895865536512 |
---|---|
author | Radha, Kodali Bansal, Mohan |
author_facet | Radha, Kodali Bansal, Mohan |
author_sort | Radha, Kodali |
collection | PubMed |
description | Children may benefit from automatic speaker identification in a variety of applications, including child security, safety, and education. The key focus of this study is to develop a closed-set child speaker identification system for non-native speakers of English in both text-dependent and text-independent speech tasks in order to track how the speaker’s fluency affects the system. The multi-scale wavelet scattering transform is used to compensate for concerns like the loss of high-frequency information caused by the most widely used mel frequency cepstral coefficients feature extractor. The proposed large-scale speaker identification system succeeds well by employing wavelet scattered Bi-LSTM. While this procedure is used to identify non-native children in multiple classes, average values of accuracy, precision, recall, and F-measure are being used to assess the performance of the model in text-independent and text-dependent tasks, which outperforms the existing models. |
format | Online Article Text |
id | pubmed-10023307 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | Springer Nature Singapore |
record_format | MEDLINE/PubMed |
spelling | pubmed-100233072023-03-21 Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children Radha, Kodali Bansal, Mohan Int J Inf Technol Original Research Children may benefit from automatic speaker identification in a variety of applications, including child security, safety, and education. The key focus of this study is to develop a closed-set child speaker identification system for non-native speakers of English in both text-dependent and text-independent speech tasks in order to track how the speaker’s fluency affects the system. The multi-scale wavelet scattering transform is used to compensate for concerns like the loss of high-frequency information caused by the most widely used mel frequency cepstral coefficients feature extractor. The proposed large-scale speaker identification system succeeds well by employing wavelet scattered Bi-LSTM. While this procedure is used to identify non-native children in multiple classes, average values of accuracy, precision, recall, and F-measure are being used to assess the performance of the model in text-independent and text-dependent tasks, which outperforms the existing models. Springer Nature Singapore 2023-03-18 2023 /pmc/articles/PMC10023307/ /pubmed/37056796 http://dx.doi.org/10.1007/s41870-023-01224-8 Text en © The Author(s), under exclusive licence to Bharati Vidyapeeth's Institute of Computer Applications and Management 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. 
These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic. |
spellingShingle | Original Research Radha, Kodali Bansal, Mohan Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children |
title | Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children |
title_full | Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children |
title_fullStr | Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children |
title_full_unstemmed | Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children |
title_short | Closed-set automatic speaker identification using multi-scale recurrent networks in non-native children |
title_sort | closed-set automatic speaker identification using multi-scale recurrent networks in non-native children |
topic | Original Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10023307/ https://www.ncbi.nlm.nih.gov/pubmed/37056796 http://dx.doi.org/10.1007/s41870-023-01224-8 |
work_keys_str_mv | AT radhakodali closedsetautomaticspeakeridentificationusingmultiscalerecurrentnetworksinnonnativechildren AT bansalmohan closedsetautomaticspeakeridentificationusingmultiscalerecurrentnetworksinnonnativechildren |