Disrupting adversarial transferability in deep neural networks
Adversarial attack transferability is well recognized in deep learning. Previous work has partially explained transferability by recognizing common adversarial subspaces and correlations between decision boundaries, but little is known beyond that. We propose that transferability between seemingly different models is due to a high linear correlation between the feature sets that different networks extract. In other words, two models trained on the same task that are distant in the parameter space likely extract features in the same fashion, linked by trivial affine transformations between the latent spaces. Furthermore, we show how applying a feature correlation loss, which decorrelates the extracted features in corresponding latent spaces, can reduce the transferability of adversarial attacks between models, suggesting that the models complete tasks in semantically different ways. Finally, we propose a dual-neck autoencoder (DNA), which leverages this feature correlation loss to create two meaningfully different encodings of input information with reduced transferability.
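The abstract's central claim, that independently trained models extract feature sets linked by near-trivial affine transformations, can be illustrated with a toy check (a sketch, not the paper's experiment; all names and dimensions here are hypothetical): fit a linear map between two latent spaces and measure how much variance it explains.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 16

# Hypothetical latent codes from "model 1" on a shared batch of inputs.
z1 = rng.normal(size=(n, d))
# "Model 2" codes: an affine transform of z1 plus a little noise,
# mimicking two networks that extract the same features up to a linear map.
link = rng.normal(size=(d, d))
z2 = z1 @ link + 0.1 * rng.normal(size=(n, d))

# Least-squares fit of the linear map from z1 to z2,
# and the fraction of variance that map explains.
coef, *_ = np.linalg.lstsq(z1, z2, rcond=None)
r2 = 1.0 - ((z2 - z1 @ coef) ** 2).mean() / z2.var()
print(f"variance explained by a linear map: {r2:.3f}")
```

A high value between real networks' latents would support the affine-linkage hypothesis; the decorrelation training the paper proposes aims to drive it down.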
| Main Authors: | Wiedeman, Christopher; Wang, Ge |
|---|---|
| Format: | Online Article Text |
| Language: | English |
| Published: | Elsevier, 2022 |
| Subjects: | |
| Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9122968/ https://www.ncbi.nlm.nih.gov/pubmed/35607626 http://dx.doi.org/10.1016/j.patter.2022.100472 |
| _version_ | 1784711461005688832 |
|---|---|
| author | Wiedeman, Christopher Wang, Ge |
| author_facet | Wiedeman, Christopher Wang, Ge |
| author_sort | Wiedeman, Christopher |
| collection | PubMed |
| description | Adversarial attack transferability is well recognized in deep learning. Previous work has partially explained transferability by recognizing common adversarial subspaces and correlations between decision boundaries, but little is known beyond that. We propose that transferability between seemingly different models is due to a high linear correlation between the feature sets that different networks extract. In other words, two models trained on the same task that are distant in the parameter space likely extract features in the same fashion, linked by trivial affine transformations between the latent spaces. Furthermore, we show how applying a feature correlation loss, which decorrelates the extracted features in corresponding latent spaces, can reduce the transferability of adversarial attacks between models, suggesting that the models complete tasks in semantically different ways. Finally, we propose a dual-neck autoencoder (DNA), which leverages this feature correlation loss to create two meaningfully different encodings of input information with reduced transferability. |
| format | Online Article Text |
| id | pubmed-9122968 |
| institution | National Center for Biotechnology Information |
| language | English |
| publishDate | 2022 |
| publisher | Elsevier |
| record_format | MEDLINE/PubMed |
| spelling | pubmed-9122968 2022-05-22 Disrupting adversarial transferability in deep neural networks Wiedeman, Christopher Wang, Ge Patterns (N Y) Article Elsevier 2022-03-24 /pmc/articles/PMC9122968/ /pubmed/35607626 http://dx.doi.org/10.1016/j.patter.2022.100472 Text en © 2022 The Authors. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). |
| spellingShingle | Article Wiedeman, Christopher Wang, Ge Disrupting adversarial transferability in deep neural networks |
| title | Disrupting adversarial transferability in deep neural networks |
| title_full | Disrupting adversarial transferability in deep neural networks |
| title_fullStr | Disrupting adversarial transferability in deep neural networks |
| title_full_unstemmed | Disrupting adversarial transferability in deep neural networks |
| title_short | Disrupting adversarial transferability in deep neural networks |
| title_sort | disrupting adversarial transferability in deep neural networks |
| topic | Article |
| url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9122968/ https://www.ncbi.nlm.nih.gov/pubmed/35607626 http://dx.doi.org/10.1016/j.patter.2022.100472 |
| work_keys_str_mv | AT wiedemanchristopher disruptingadversarialtransferabilityindeepneuralnetworks AT wangge disruptingadversarialtransferabilityindeepneuralnetworks |
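The feature correlation loss described in the abstract can be sketched as a penalty on the cross-correlation between two standardized latent spaces. This is a minimal NumPy illustration assuming a mean-squared-correlation penalty; the published loss may differ in detail, and the function name is ours.

```python
import numpy as np

def feature_correlation_loss(z1, z2, eps=1e-8):
    """Mean squared cross-correlation between two sets of latent features.

    z1: (n, d1) activations from one network or neck on a batch of inputs.
    z2: (n, d2) activations from the other on the same batch.
    Driving this toward zero linearly decorrelates the two latent spaces.
    """
    # Standardize each feature dimension to zero mean and unit variance.
    z1 = (z1 - z1.mean(axis=0)) / (z1.std(axis=0) + eps)
    z2 = (z2 - z2.mean(axis=0)) / (z2.std(axis=0) + eps)
    # Cross-correlation matrix of shape (d1, d2).
    corr = z1.T @ z2 / z1.shape[0]
    return float((corr ** 2).mean())

rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 8))
w = rng.normal(size=(1000, 8))
print(feature_correlation_loss(z, z))  # identical features: high penalty
print(feature_correlation_loss(z, w))  # independent features: near zero
```

In a dual-neck autoencoder such a penalty would be added to the reconstruction objective for both necks, pushing the two encoders toward meaningfully different codes.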