Privacy‐preserving quality control of neuroimaging datasets in federated environments

Privacy concerns for rare disease data, institutional or IRB policies, access to local computational or storage resources or download capabilities are among the reasons that may preclude analyses that pool data to a single site. A growing number of multisite projects and consortia were formed to fun...

Bibliographic Details
Main authors: Saha, Debbrata K., Calhoun, Vince D., Du, Yuhui, Fu, Zening, Kwon, Soo Min, Sarwate, Anand D., Panta, Sandeep R., Plis, Sergey M.
Format: Online Article Text
Language: English
Published: John Wiley & Sons, Inc. 2022
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8996357/
https://www.ncbi.nlm.nih.gov/pubmed/35243723
http://dx.doi.org/10.1002/hbm.25788
author Saha, Debbrata K.
Calhoun, Vince D.
Du, Yuhui
Fu, Zening
Kwon, Soo Min
Sarwate, Anand D.
Panta, Sandeep R.
Plis, Sergey M.
collection PubMed
description Privacy concerns for rare disease data, institutional or IRB policies, access to local computational or storage resources or download capabilities are among the reasons that may preclude analyses that pool data to a single site. A growing number of multisite projects and consortia were formed to function in the federated environment to conduct productive research under constraints of this kind. In this scenario, a quality control tool that visualizes decentralized data in its entirety via global aggregation of local computations is especially important, as it would allow the screening of samples that cannot be jointly evaluated otherwise. To solve this issue, we present two algorithms: decentralized data stochastic neighbor embedding, dSNE, and its differentially private counterpart, DP‐dSNE. We leverage publicly available datasets to simultaneously map data samples located at different sites according to their similarities. Even though the data never leaves the individual sites, dSNE does not provide any formal privacy guarantees. To overcome that, we rely on differential privacy: a formal mathematical guarantee that protects individuals from being identified as contributors to a dataset. We implement DP‐dSNE with AdaCliP, a method recently proposed to add less noise to the gradients per iteration. We introduce metrics for measuring the embedding quality and validate our algorithms on these metrics against their centralized counterpart on two toy datasets. Our validation on six multisite neuroimaging datasets shows promising results for the quality control tasks of visualization and outlier detection, highlighting the potential of our private, decentralized visualization approach.
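The description mentions adding noise to gradients at each iteration to obtain differential privacy. As a rough illustration only, the sketch below shows the generic clip-then-add-Gaussian-noise step that such methods build on. It is not AdaCliP and not the authors' DP‐dSNE implementation: it uses a fixed clipping norm, and the function name `privatize_gradient` and the parameters `clip_norm` and `noise_multiplier` are hypothetical names chosen for this example.

```python
import math
import random


def privatize_gradient(grad, clip_norm, noise_multiplier, rng):
    """Clip a gradient to a fixed L2 norm, then add Gaussian noise.

    Assumption: plain fixed-norm clipping, not the adaptive scheme
    used by AdaCliP. All names here are illustrative.
    """
    # L2 norm of the raw gradient.
    norm = math.sqrt(sum(g * g for g in grad))
    # Shrink the gradient only if it exceeds the clipping bound.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    # Noise standard deviation is proportional to the clipping bound.
    sigma = noise_multiplier * clip_norm
    return [g * scale + rng.gauss(0.0, sigma) for g in grad]


rng = random.Random(0)
noisy = privatize_gradient([3.0, 4.0], clip_norm=1.0, noise_multiplier=0.5, rng=rng)
print(noisy)
```

Clipping bounds how much any single sample can change the gradient, and scaling the noise to that bound is what lets a formal privacy guarantee be stated. The actual privacy level depends on the noise multiplier, the number of iterations, and the accounting method, none of which this sketch addresses.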
format Online
Article
Text
id pubmed-8996357
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher John Wiley & Sons, Inc.
record_format MEDLINE/PubMed
spelling pubmed-8996357 2022-04-15 Privacy‐preserving quality control of neuroimaging datasets in federated environments Saha, Debbrata K. Calhoun, Vince D. Du, Yuhui Fu, Zening Kwon, Soo Min Sarwate, Anand D. Panta, Sandeep R. Plis, Sergey M. Hum Brain Mapp Research Articles
John Wiley & Sons, Inc. 2022-03-04 /pmc/articles/PMC8996357/ /pubmed/35243723 http://dx.doi.org/10.1002/hbm.25788 Text en © 2022 The Authors. Human Brain Mapping published by Wiley Periodicals LLC. This is an open access article under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits use, distribution and reproduction in any medium, provided the original work is properly cited and is not used for commercial purposes.
title Privacy‐preserving quality control of neuroimaging datasets in federated environments
topic Research Articles