
On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection

The paradigm-shifting developments of cryptography and information theory have focused on the privacy of data-sharing systems, such as epidemiological studies, where agencies are collecting far more personal data than they need, causing intrusions on patients’ privacy. To study the capability of the...

Bibliographic Details
Main authors: Cheng, Jiale; Liu, Nan; Kang, Wei
Format: Online Article Text
Language: English
Published: MDPI 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10137694/
https://www.ncbi.nlm.nih.gov/pubmed/37190413
http://dx.doi.org/10.3390/e25040625
author Cheng, Jiale
Liu, Nan
Kang, Wei
collection PubMed
description The paradigm-shifting developments of cryptography and information theory have focused on the privacy of data-sharing systems, such as epidemiological studies, where agencies collect far more personal data than they need, intruding on patients' privacy. To study the capability of data collection while protecting privacy from an information-theoretic perspective, we formulate a new distributed multiparty computation problem called privacy-preserving epidemiological data collection. In our setting, a data collector requires a linear combination of K users' data through a storage system consisting of N servers. Privacy must be protected when the users, servers, and data collector do not trust each other. For the users, all data must be protected from up to E colluding servers; for the servers, no information beyond the desired linear combination may be leaked to the data collector; and for the data collector, no single server may learn anything about the coefficients of the linear combination. Our goal is to find the optimal collection rate, defined as the ratio of the size of the user's message to the total size of the downloads from the N servers to the data collector. For achievability, we propose an asymptotic capacity-achieving scheme when [Formula: see text], by applying the cross-subspace alignment method to our construction; for the converse, we prove an upper bound on the asymptotic rate of all achievable schemes when [Formula: see text]. Additionally, we show that a positive asymptotic capacity is not possible when [Formula: see text]. The achievability and converse results meet when the number of users goes to infinity, yielding the asymptotic capacity. Our work broadens current research on data privacy in information theory and gives the best achievable asymptotic performance that any epidemiological data collector can obtain.
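A minimal sketch of the setting described in the abstract, offered as an illustration rather than the authors' construction: it uses plain Shamir secret sharing, not the cross-subspace alignment scheme of the paper. It shows only two ingredients, namely that each user's data stays hidden from any E colluding servers and that the collector can recover a linear combination from the servers' answers. It does not hide the coefficients from the servers and makes no claim about the collection rate. The prime modulus, the function names, and the evaluation points 1..N are all assumptions made for this example.

```python
import random

P = 2_147_483_647  # prime field size (2**31 - 1); an illustrative choice


def share(secret, n_servers, e_collude, rng):
    """Split `secret` into one Shamir share per server.

    A random polynomial of degree `e_collude` with constant term `secret`
    is evaluated at x = 1..n_servers, so any `e_collude` shares alone are
    statistically independent of the secret.
    """
    coeffs = [secret % P] + [rng.randrange(P) for _ in range(e_collude)]
    return [
        sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
        for x in range(1, n_servers + 1)
    ]


def combine_at_servers(all_shares, weights):
    """Each server returns the weighted sum of the shares it stores.

    Note: here the servers see `weights`, which the paper's setting forbids.
    """
    n_servers = len(all_shares[0])
    return [
        sum(w * shares[n] for w, shares in zip(weights, all_shares)) % P
        for n in range(n_servers)
    ]


def reconstruct(points):
    """Lagrange-interpolate the polynomial through `points` at x = 0."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num, den = 1, 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (-xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, -1, P)) % P
    return total


def collect(data, weights, n_servers, e_collude, seed=0):
    """Run the toy protocol end to end and return sum(weights[k] * data[k]) mod P."""
    # random.Random is NOT cryptographically secure; it keeps the demo reproducible.
    rng = random.Random(seed)
    all_shares = [share(w, n_servers, e_collude, rng) for w in data]
    answers = combine_at_servers(all_shares, weights)
    # The combined polynomial has degree e_collude, so e_collude + 1 answers suffice.
    points = [(x, answers[x - 1]) for x in range(1, e_collude + 2)]
    return reconstruct(points)
```

For example, `collect([3, 5, 7], [2, 1, 4], n_servers=5, e_collude=2)` recovers 2·3 + 1·5 + 4·7 = 39 from three of the five servers' answers. The sketch needs Python 3.8 or later for the modular inverse `pow(den, -1, P)`.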
format Online
Article
Text
id pubmed-10137694
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher MDPI
record_format MEDLINE/PubMed
spelling pubmed-10137694 2023-04-28 On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection Cheng, Jiale Liu, Nan Kang, Wei Entropy (Basel) Article MDPI 2023-04-06 /pmc/articles/PMC10137694/ /pubmed/37190413 http://dx.doi.org/10.3390/e25040625 Text en © 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
title On the Asymptotic Capacity of Information-Theoretic Privacy-Preserving Epidemiological Data Collection
topic Article