Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification
Deep neural networks (DNNs) have been known to be vulnerable to adversarial attacks. Adversarial training (AT) is, so far, the only method that can guarantee the robustness of DNNs to adversarial attacks. However, the robustness generalization accuracy gain of AT is still far lower than the standard...
Main Authors: Wang, Desheng; Jin, Weidong; Wu, Yunpu
Format: Online Article Text
Language: English
Published: MDPI, 2023
Subjects: Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10057388/ https://www.ncbi.nlm.nih.gov/pubmed/36991962 http://dx.doi.org/10.3390/s23063252
_version_ | 1785016353425457152 |
author | Wang, Desheng Jin, Weidong Wu, Yunpu |
author_facet | Wang, Desheng Jin, Weidong Wu, Yunpu |
author_sort | Wang, Desheng |
collection | PubMed |
description | Deep neural networks (DNNs) have been known to be vulnerable to adversarial attacks. Adversarial training (AT) is, so far, the only method that can guarantee the robustness of DNNs to adversarial attacks. However, the robustness generalization accuracy gain of AT is still far lower than the standard generalization accuracy of an undefended model, and there is known to be a trade-off between the standard generalization accuracy and the robustness generalization accuracy of an adversarially trained model. In order to improve the robustness generalization and the standard generalization performance trade-off of AT, we propose a novel defense algorithm called Between-Class Adversarial Training (BCAT) that combines Between-Class learning (BC-learning) with standard AT. Specifically, BCAT mixes two adversarial examples from different classes and uses the mixed between-class adversarial examples to train a model instead of original adversarial examples during AT. We further propose BCAT+ which adopts a more powerful mixing method. BCAT and BCAT+ impose effective regularization on the feature distribution of adversarial examples to enlarge between-class distance, thus improving the robustness generalization and the standard generalization performance of AT. The proposed algorithms do not introduce any hyperparameters into standard AT; therefore, the process of hyperparameters searching can be avoided. We evaluate the proposed algorithms under both white-box attacks and black-box attacks using a spectrum of perturbation values on CIFAR-10, CIFAR-100, and SVHN datasets. The research findings indicate that our algorithms achieve better global robustness generalization performance than the state-of-the-art adversarial defense methods. |
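To make the mixing step in the abstract concrete, the following is a minimal PyTorch sketch of between-class mixing of adversarial examples. It assumes a plain BC-learning-style mixing rule (a linear combination with a random ratio and correspondingly mixed soft labels) and leaves adversarial-example generation (e.g., PGD) to the caller; the function names `between_class_mix` and `bcat_step_loss`, and the exact mixing and loss used by BCAT/BCAT+, are illustrative assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of the between-class mixing step described in the abstract
# (not the authors' code). Assumes BC-learning-style linear mixing with a random
# ratio and soft labels; BCAT+ reportedly uses a more powerful mixing rule.
import torch
import torch.nn.functional as F

def between_class_mix(x_adv_a, y_a, x_adv_b, y_b, num_classes):
    """Mix two batches of adversarial examples drawn from different classes."""
    r = torch.rand(x_adv_a.size(0), device=x_adv_a.device)   # per-sample mixing ratio in [0, 1)
    r_img = r.view(-1, 1, 1, 1)                               # broadcast over image dimensions
    x_mix = r_img * x_adv_a + (1.0 - r_img) * x_adv_b         # between-class adversarial input
    y_mix = (r.unsqueeze(1) * F.one_hot(y_a, num_classes).float()
             + (1.0 - r).unsqueeze(1) * F.one_hot(y_b, num_classes).float())
    return x_mix, y_mix

def bcat_step_loss(model, x_adv_a, y_a, x_adv_b, y_b, num_classes):
    """Soft-label cross-entropy on the mixed adversarial batch (one AT step)."""
    x_mix, y_mix = between_class_mix(x_adv_a, y_a, x_adv_b, y_b, num_classes)
    log_probs = F.log_softmax(model(x_mix), dim=1)
    return -(y_mix * log_probs).sum(dim=1).mean()
```

In a standard adversarial-training loop, `x_adv_a` and `x_adv_b` would be adversarial examples crafted from samples of different classes, and the mixed batch would replace the original adversarial batch in the training loss, as the abstract describes.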
format | Online Article Text |
id | pubmed-10057388 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2023 |
publisher | MDPI |
record_format | MEDLINE/PubMed |
spelling | pubmed-10057388 2023-03-30 Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification Wang, Desheng Jin, Weidong Wu, Yunpu Sensors (Basel) Article Deep neural networks (DNNs) have been known to be vulnerable to adversarial attacks. Adversarial training (AT) is, so far, the only method that can guarantee the robustness of DNNs to adversarial attacks. However, the robustness generalization accuracy gain of AT is still far lower than the standard generalization accuracy of an undefended model, and there is known to be a trade-off between the standard generalization accuracy and the robustness generalization accuracy of an adversarially trained model. In order to improve the robustness generalization and the standard generalization performance trade-off of AT, we propose a novel defense algorithm called Between-Class Adversarial Training (BCAT) that combines Between-Class learning (BC-learning) with standard AT. Specifically, BCAT mixes two adversarial examples from different classes and uses the mixed between-class adversarial examples to train a model instead of original adversarial examples during AT. We further propose BCAT+ which adopts a more powerful mixing method. BCAT and BCAT+ impose effective regularization on the feature distribution of adversarial examples to enlarge between-class distance, thus improving the robustness generalization and the standard generalization performance of AT. The proposed algorithms do not introduce any hyperparameters into standard AT; therefore, the process of hyperparameters searching can be avoided. We evaluate the proposed algorithms under both white-box attacks and black-box attacks using a spectrum of perturbation values on CIFAR-10, CIFAR-100, and SVHN datasets. The research findings indicate that our algorithms achieve better global robustness generalization performance than the state-of-the-art adversarial defense methods. MDPI 2023-03-20 /pmc/articles/PMC10057388/ /pubmed/36991962 http://dx.doi.org/10.3390/s23063252 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). |
spellingShingle | Article Wang, Desheng Jin, Weidong Wu, Yunpu Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification |
title | Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification |
title_full | Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification |
title_fullStr | Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification |
title_full_unstemmed | Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification |
title_short | Between-Class Adversarial Training for Improving Adversarial Robustness of Image Classification |
title_sort | between-class adversarial training for improving adversarial robustness of image classification |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10057388/ https://www.ncbi.nlm.nih.gov/pubmed/36991962 http://dx.doi.org/10.3390/s23063252 |
work_keys_str_mv | AT wangdesheng betweenclassadversarialtrainingforimprovingadversarialrobustnessofimageclassification AT jinweidong betweenclassadversarialtrainingforimprovingadversarialrobustnessofimageclassification AT wuyunpu betweenclassadversarialtrainingforimprovingadversarialrobustnessofimageclassification |