A Textual Backdoor Defense Method Based on Deep Feature Classification

Natural language processing (NLP) models based on deep neural networks (DNNs) are vulnerable to backdoor attacks. Existing backdoor defense methods are limited in both effectiveness and the scenarios they cover. We propose a textual backdoor defense method based on deep feature classification. The method include...

Bibliographic Details
Main authors:	Shao, Kun; Yang, Junan; Hu, Pengjiang; Li, Xiaoshuai
Format:	Online Article Text
Language:	English
Published:	MDPI 2023
Subjects:	Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955932/ https://www.ncbi.nlm.nih.gov/pubmed/36832587 http://dx.doi.org/10.3390/e25020220

author	Shao, Kun Yang, Junan Hu, Pengjiang Li, Xiaoshuai
collection	PubMed
description	Natural language processing (NLP) models based on deep neural networks (DNNs) are vulnerable to backdoor attacks. Existing backdoor defense methods are limited in both effectiveness and the scenarios they cover. We propose a textual backdoor defense method based on deep feature classification. The method includes deep feature extraction and classifier construction, and it exploits the distinguishability of the deep features of poisoned data and benign data. Backdoor defense is implemented in both offline and online scenarios. We conducted defense experiments on two datasets and two models for a variety of backdoor attacks. The experimental results demonstrate the effectiveness of this defense approach and show that it outperforms the baseline defense method.
format	Online Article Text
id	pubmed-9955932
institution	National Center for Biotechnology Information
language	English
publishDate	2023
publisher	MDPI
record_format	MEDLINE/PubMed
spelling	pubmed-9955932 2023-02-25 A Textual Backdoor Defense Method Based on Deep Feature Classification Shao, Kun Yang, Junan Hu, Pengjiang Li, Xiaoshuai Entropy (Basel) Article Natural language processing (NLP) models based on deep neural networks (DNNs) are vulnerable to backdoor attacks. Existing backdoor defense methods have limited effectiveness and coverage scenarios. We propose a textual backdoor defense method based on deep feature classification. The method includes deep feature extraction and classifier construction. The method exploits the distinguishability of deep features of poisoned data and benign data. Backdoor defense is implemented in both offline and online scenarios. We conducted defense experiments on two datasets and two models for a variety of backdoor attacks. The experimental results demonstrate the effectiveness of this defense approach and outperform the baseline defense method. MDPI 2023-01-23 /pmc/articles/PMC9955932/ /pubmed/36832587 http://dx.doi.org/10.3390/e25020220 Text en © 2023 by the authors. https://creativecommons.org/licenses/by/4.0/ Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title	A Textual Backdoor Defense Method Based on Deep Feature Classification
topic	Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9955932/ https://www.ncbi.nlm.nih.gov/pubmed/36832587 http://dx.doi.org/10.3390/e25020220

Similar Items