Disrupting adversarial transferability in deep neural networks

Adversarial attack transferability is well recognized in deep learning. Previous work has partially explained transferability by recognizing common adversarial subspaces and correlations between decision boundaries, but little is known beyond that. We propose that transferability between seemingly different models is due to a high linear correlation between the feature sets that different networks extract. In other words, two models trained on the same task that are distant in the parameter space likely extract features in the same fashion, linked by trivial affine transformations between the latent spaces. Furthermore, we show how applying a feature correlation loss, which decorrelates the extracted features in corresponding latent spaces, can reduce the transferability of adversarial attacks between models, suggesting that the models complete tasks in semantically different ways. Finally, we propose a dual-neck autoencoder (DNA), which leverages this feature correlation loss to create two meaningfully different encodings of input information with reduced transferability.

Bibliographic Details
Main Authors: Wiedeman, Christopher; Wang, Ge
Format: Online Article Text
Language: English
Published: Elsevier 2022
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9122968/
https://www.ncbi.nlm.nih.gov/pubmed/35607626
http://dx.doi.org/10.1016/j.patter.2022.100472
_version_ 1784711461005688832
author Wiedeman, Christopher
Wang, Ge
author_facet Wiedeman, Christopher
Wang, Ge
author_sort Wiedeman, Christopher
collection PubMed
description Adversarial attack transferability is well recognized in deep learning. Previous work has partially explained transferability by recognizing common adversarial subspaces and correlations between decision boundaries, but little is known beyond that. We propose that transferability between seemingly different models is due to a high linear correlation between the feature sets that different networks extract. In other words, two models trained on the same task that are distant in the parameter space likely extract features in the same fashion, linked by trivial affine transformations between the latent spaces. Furthermore, we show how applying a feature correlation loss, which decorrelates the extracted features in corresponding latent spaces, can reduce the transferability of adversarial attacks between models, suggesting that the models complete tasks in semantically different ways. Finally, we propose a dual-neck autoencoder (DNA), which leverages this feature correlation loss to create two meaningfully different encodings of input information with reduced transferability.
format Online
Article
Text
id pubmed-9122968
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Elsevier
record_format MEDLINE/PubMed
spelling pubmed-9122968 2022-05-22 Disrupting adversarial transferability in deep neural networks Wiedeman, Christopher Wang, Ge Patterns (N Y) Article Adversarial attack transferability is well recognized in deep learning. Previous work has partially explained transferability by recognizing common adversarial subspaces and correlations between decision boundaries, but little is known beyond that. We propose that transferability between seemingly different models is due to a high linear correlation between the feature sets that different networks extract. In other words, two models trained on the same task that are distant in the parameter space likely extract features in the same fashion, linked by trivial affine transformations between the latent spaces. Furthermore, we show how applying a feature correlation loss, which decorrelates the extracted features in corresponding latent spaces, can reduce the transferability of adversarial attacks between models, suggesting that the models complete tasks in semantically different ways. Finally, we propose a dual-neck autoencoder (DNA), which leverages this feature correlation loss to create two meaningfully different encodings of input information with reduced transferability. Elsevier 2022-03-24 /pmc/articles/PMC9122968/ /pubmed/35607626 http://dx.doi.org/10.1016/j.patter.2022.100472 Text en © 2022 The Authors https://creativecommons.org/licenses/by-nc-nd/4.0/ This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
spellingShingle Article
Wiedeman, Christopher
Wang, Ge
Disrupting adversarial transferability in deep neural networks
title Disrupting adversarial transferability in deep neural networks
title_full Disrupting adversarial transferability in deep neural networks
title_fullStr Disrupting adversarial transferability in deep neural networks
title_full_unstemmed Disrupting adversarial transferability in deep neural networks
title_short Disrupting adversarial transferability in deep neural networks
title_sort disrupting adversarial transferability in deep neural networks
topic Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9122968/
https://www.ncbi.nlm.nih.gov/pubmed/35607626
http://dx.doi.org/10.1016/j.patter.2022.100472
work_keys_str_mv AT wiedemanchristopher disruptingadversarialtransferabilityindeepneuralnetworks
AT wangge disruptingadversarialtransferabilityindeepneuralnetworks