
Speaker Recognition Using Constrained Convolutional Neural Networks in Emotional Speech

Speaker recognition is an important classification task, which can be solved using several approaches. Although building a speaker recognition model on a closed set of speakers under neutral speaking conditions is a well-researched task and there are solutions that provide excellent performance, the...


Bibliographic Details
Main authors:	Simić, Nikola; Suzić, Siniša; Nosek, Tijana; Vujović, Mia; Perić, Zoran; Savić, Milan; Delić, Vlado
Format:	Online Article Text
Language:	English
Published:	MDPI 2022
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8947568/ https://www.ncbi.nlm.nih.gov/pubmed/35327924 http://dx.doi.org/10.3390/e24030414

author	Simić, Nikola; Suzić, Siniša; Nosek, Tijana; Vujović, Mia; Perić, Zoran; Savić, Milan; Delić, Vlado
author_sort	Simić, Nikola
collection	PubMed
description	Speaker recognition is an important classification task, which can be solved using several approaches. Although building a speaker recognition model on a closed set of speakers under neutral speaking conditions is a well-researched task and there are solutions that provide excellent performance, the classification accuracy of developed models significantly decreases when applying them to emotional speech or in the presence of interference. Furthermore, deep models may require a large number of parameters, so constrained solutions are desirable in order to implement them on edge devices in the Internet of Things systems for real-time detection. The aim of this paper is to propose a simple and constrained convolutional neural network for speaker recognition tasks and to examine its robustness for recognition in emotional speech conditions. We examine three quantization methods for developing a constrained network: floating-point eight format, ternary scalar quantization, and binary scalar quantization. The results are demonstrated on the recently recorded SEAC dataset.
format	Online Article Text
id	pubmed-8947568
institution	National Center for Biotechnology Information
language	English
publishDate	2022
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-8947568 2022-03-25 Entropy (Basel) Article MDPI 2022-03-16 /pmc/articles/PMC8947568/ /pubmed/35327924 http://dx.doi.org/10.3390/e24030414 Text en © 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	Speaker Recognition Using Constrained Convolutional Neural Networks in Emotional Speech
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8947568/ https://www.ncbi.nlm.nih.gov/pubmed/35327924 http://dx.doi.org/10.3390/e24030414
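
The description field above names binary and ternary scalar quantization as two of the three ways the network is constrained. The sketch below illustrates the general idea on a plain list of weights. It is a generic illustration, not the authors' quantizers: the function names, the mean-absolute-value scale, and the `threshold_ratio=0.7` heuristic are all assumptions made for this example, and the floating-point eight format is not covered.

```python
from statistics import fmean


def binary_quantize(weights):
    """Map every weight to +alpha or -alpha (1 bit per weight).

    Assumption: alpha is the mean absolute weight, a common choice
    for binary scalar quantizers.
    """
    alpha = fmean([abs(w) for w in weights])
    return [alpha if w >= 0 else -alpha for w in weights]


def ternary_quantize(weights, threshold_ratio=0.7):
    """Map every weight to -alpha, 0, or +alpha (3 levels per weight).

    Assumption: weights whose magnitude is at most
    threshold_ratio * mean(|w|) are zeroed, and alpha is the mean
    magnitude of the weights that survive the threshold.
    """
    delta = threshold_ratio * fmean([abs(w) for w in weights])
    kept = [abs(w) for w in weights if abs(w) > delta]
    alpha = fmean(kept) if kept else 0.0
    return [
        0.0 if abs(w) <= delta else (alpha if w > 0 else -alpha)
        for w in weights
    ]


if __name__ == "__main__":
    example = [0.5, -1.5, 0.25, -0.25]  # hypothetical layer weights
    print(binary_quantize(example))
    print(ternary_quantize(example))
```

Both functions keep only a sign (or a sign plus a zero level) and one shared scale per weight list, which is what makes such representations attractive for the edge devices mentioned in the description.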
