K-Anonymity Privacy Protection Algorithm for Multi-Dimensional Data against Skewness and Similarity Attacks

Privacy protection for multi-dimensional data publishing currently receives significant attention in application scenarios such as scientific research and policy-making. The K-anonymity mechanism based on clustering is the main method of shared-data desensitization, but it...

Bibliographic Details
Main authors: Su, Bing; Huang, Jiaxuan; Miao, Kelei; Wang, Zhangquan; Zhang, Xudong; Chen, Yourong
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9919945/
https://www.ncbi.nlm.nih.gov/pubmed/36772594
http://dx.doi.org/10.3390/s23031554
author Su, Bing
Huang, Jiaxuan
Miao, Kelei
Wang, Zhangquan
Zhang, Xudong
Chen, Yourong
collection PubMed
description Privacy protection for multi-dimensional data publishing currently receives significant attention in application scenarios such as scientific research and policy-making. The K-anonymity mechanism based on clustering is the main method of shared-data desensitization, but it suffers from inconsistent clustering results and low clustering accuracy. It also cannot defend against several common attacks, such as skewness and similarity attacks, at the same time. To defend against these attacks, we propose a K-anonymity privacy protection algorithm for multi-dimensional data against skewness and similarity attacks (KAPP) combined with t-closeness. First, we propose a multi-dimensional sensitive data clustering algorithm based on improved African vultures optimization. More specifically, we improve the initialization, fitness calculation, and solution update strategy of the clustering center. The improved African vultures optimization can provide the optimal solution in multiple dimensions and achieve highly accurate clustering of the multi-dimensional dataset based on multiple sensitive attributes. It ensures that multi-dimensional data in different clusters differ in their sensitive data. After the dataset is anonymized, each equivalence class contains fewer similar sensitive values, so the preconditions that skewness and similarity attacks rely on to steal data no longer hold. We also propose an equivalence class partition method based on t-closeness and a measurement of the difference value of the sensitive data distribution. Specifically, we calculate the difference value of the sensitive data distribution for each equivalence class and then combine the equivalence classes with larger difference values, so that each equivalence class satisfies t-closeness. This method ensures that multi-dimensional data in the same equivalence class differ in multiple sensitive attributes, and thus can effectively defend against skewness and similarity attacks. Moreover, we generalize the sensitive attributes with significant weight and all quasi-identifier attributes to achieve anonymous protection of the dataset. The experimental results show that KAPP improves clustering accuracy, diversity, and anonymity compared to other similar methods under skewness and similarity attacks.
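The abstract's two acceptance conditions, k-anonymity and t-closeness, can be illustrated with a minimal sketch. This is not the KAPP algorithm from the article. It only checks already-formed equivalence classes, and it assumes a single categorical sensitive attribute measured with the variational distance (half the L1 distance between distributions). The function names, the toy data, and the values of `k` and `t` are hypothetical.

```python
from collections import Counter


def distribution(values):
    """Return the relative frequency of each value in a list."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def variational_distance(p, q):
    """Half the L1 distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(key, 0.0) - q.get(key, 0.0)) for key in keys)


def check_classes(classes, k, t):
    """Report, per equivalence class, the k-anonymity and t-closeness status.

    `classes` is a list of lists; each inner list holds the sensitive values
    of the records in one equivalence class (an assumed input layout).
    """
    all_values = [value for cls in classes for value in cls]
    overall = distribution(all_values)
    report = []
    for cls in classes:
        distance = variational_distance(distribution(cls), overall)
        report.append({
            "size_ok": len(cls) >= k,   # k-anonymity: at least k records
            "distance": distance,       # gap to the whole-table distribution
            "t_ok": distance <= t,      # t-closeness: gap no larger than t
        })
    return report


# Hypothetical toy data: two classes whose sensitive values are well mixed.
classes = [
    ["flu", "flu", "cancer", "hiv"],
    ["flu", "cancer", "cancer", "hiv"],
]
report = check_classes(classes, k=4, t=0.2)
```

A class whose sensitive values are concentrated on one value, such as `["a", "a"]` next to `["b", "b"]`, sits far from the whole-table distribution and fails the `t_ok` check. That concentration is the situation skewness and similarity attacks exploit.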
format Online
Article
Text
id pubmed-9919945
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-9919945 2023-02-12 K-Anonymity Privacy Protection Algorithm for Multi-Dimensional Data against Skewness and Similarity Attacks Su, Bing Huang, Jiaxuan Miao, Kelei Wang, Zhangquan Zhang, Xudong Chen, Yourong Sensors (Basel) Article MDPI 2023-01-31 /pmc/articles/PMC9919945/ /pubmed/36772594 http://dx.doi.org/10.3390/s23031554 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title K-Anonymity Privacy Protection Algorithm for Multi-Dimensional Data against Skewness and Similarity Attacks
topic Article