
Efficient Attention Branch Network with Combined Loss Function for Automatic Speaker Verification Spoof Detection

Many endeavors have sought to develop countermeasure techniques as enhancements on Automatic Speaker Verification (ASV) systems, in order to make them more robust against spoof attacks. As evidenced by the latest ASVspoof 2019 countermeasure challenge, models currently deployed for the task of ASV a...


Bibliographic Details
Main Authors:	Rostami, Amir Mohammad, Homayounpour, Mohammad Mehdi, Nickabadi, Ahmad
Format:	Online Article Text
Language:	English
Published:	Springer US 2023
Subjects:	Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9947936/ https://www.ncbi.nlm.nih.gov/pubmed/36852137 http://dx.doi.org/10.1007/s00034-023-02314-5

_version_	1784892669695098880
author	Rostami, Amir Mohammad Homayounpour, Mohammad Mehdi Nickabadi, Ahmad
author_sort	Rostami, Amir Mohammad
collection	PubMed
description	Many efforts have sought to develop countermeasure techniques that enhance Automatic Speaker Verification (ASV) systems and make them more robust against spoof attacks. As evidenced by the latest ASVspoof 2019 countermeasure challenge, even the best models currently deployed for ASV generalize poorly to unseen attacks. Jointly improving the components of ASV spoof detection systems, including the classifier, the feature extraction phase, and the model loss function, may lead to better detection of attacks by these systems. Accordingly, the present study proposes the Efficient Attention Branch Network (EABN) architecture with a combined loss function to address model generalization to unseen attacks. The EABN is based on attention and perception branches. The attention branch provides an attention mask that improves classification performance and is also interpretable from a human point of view. The perception branch is used for the main purpose, which is spoof detection. The new EfficientNet-A0 architecture was optimized and employed for the perception branch, with nearly ten times fewer parameters and approximately seven times fewer floating-point operations than SE-Res2Net50, the best existing network. On the ASVspoof 2019 dataset, the proposed method achieved EER = 0.86% and t-DCF = 0.0239 in the Physical Access (PA) scenario using logPowSpec as the input feature extraction method. Furthermore, using the LFCC feature and SE-Res2Net50 for the perception branch, the proposed model achieved EER = 1.89% and t-DCF = 0.507 in the Logical Access (LA) scenario, which, to the best of our knowledge, makes it the best single-system ASV spoofing countermeasure method.
format	Online Article Text
id	pubmed-9947936
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	Springer US
record_format	MEDLINE/PubMed
spelling	pubmed-9947936 2023-02-23 Efficient Attention Branch Network with Combined Loss Function for Automatic Speaker Verification Spoof Detection Rostami, Amir Mohammad Homayounpour, Mohammad Mehdi Nickabadi, Ahmad Circuits Syst Signal Process Article Springer US 2023-02-23 /pmc/articles/PMC9947936/ /pubmed/36852137 http://dx.doi.org/10.1007/s00034-023-02314-5 Text en © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023, Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law. This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.
title	Efficient Attention Branch Network with Combined Loss Function for Automatic Speaker Verification Spoof Detection
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9947936/ https://www.ncbi.nlm.nih.gov/pubmed/36852137 http://dx.doi.org/10.1007/s00034-023-02314-5


Similar Items