
More practical differentially private publication of key statistics in GWAS

Motivation: Analyses of datasets that contain personal genomic information are very important for revealing associations between diseases and genomes. Genome-wide association studies, which are large-scale genetic statistical analyses, often involve tests with contingency tables. However, if the...


Bibliographic Details
Main Authors: Yamamoto, Akito; Shibuya, Tetsuo
Format: Online Article Text
Language: English
Published: Oxford University Press 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9710635/
https://www.ncbi.nlm.nih.gov/pubmed/36700105
http://dx.doi.org/10.1093/bioadv/vbab004
author Yamamoto, Akito
Shibuya, Tetsuo
collection PubMed
description Motivation: Analyses of datasets that contain personal genomic information are very important for revealing associations between diseases and genomes. Genome-wide association studies, which are large-scale genetic statistical analyses, often involve tests with contingency tables. However, if the statistics obtained by these tests are published as they are, sensitive information about individuals could be leaked. Existing studies have proposed privacy-preserving methods for statistics in the χ² test with a 3 × 2 contingency table, but they do not cover all the tests used in association studies. In addition, existing methods for releasing differentially private P-values are not practical. Results: In this work, we propose methods for releasing statistics in the χ² test, Fisher's exact test and the Cochran–Armitage trend test while preserving both personal privacy and utility. Our methods for releasing P-values are the first to achieve practicality under the concept of differential privacy by considering their base-10 logarithms. We provide theoretical guarantees by showing the sensitivity of the above statistics. From our experimental results, we evaluate the utility of the proposed methods and show appropriate thresholds with high accuracy for using the private statistics in actual tests. Availability and implementation: A Python implementation of our experiments is available at https://github.com/ay0408/DP-statistics-GWAS. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
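The abstract describes releasing a test statistic, or the base-10 logarithm of its P-value, with noise calibrated to the statistic's sensitivity. The following is a minimal sketch of that general idea using the standard Laplace mechanism on a 3 × 2 genotype-by-status table. It is not the authors' implementation: the function names, the example table and, in particular, the `sensitivity` values passed in are hypothetical placeholders, whereas the paper derives the actual sensitivity bounds.

```python
import math
import random


def chi2_stat(table):
    """Pearson chi-squared statistic for a contingency table given as a list of rows."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    return stat


def chi2_pvalue_df2(stat):
    """Upper-tail P-value for 2 degrees of freedom (a 3 x 2 table): exp(-x / 2)."""
    return math.exp(-stat / 2.0)


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two i.i.d. exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_release(value, sensitivity, epsilon, rng):
    """Laplace mechanism: add noise with scale sensitivity / epsilon."""
    return value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical counts: rows are genotypes (AA, Aa, aa), columns are case/control.
    table = [[30, 10], [40, 40], [30, 50]]
    stat = chi2_stat(table)
    log10_p = math.log10(chi2_pvalue_df2(stat))
    # ASSUMPTION: these sensitivity values are illustrative placeholders only.
    print("noisy chi2:", private_release(stat, sensitivity=4.0, epsilon=1.0, rng=rng))
    print("noisy log10 P:", private_release(log10_p, sensitivity=1.0, epsilon=1.0, rng=rng))
```

Working on the logarithm keeps the noise on the same scale as the quantity that is actually compared with a significance threshold, which is one plausible reading of why the abstract calls this approach more practical than perturbing raw P-values.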
format Online
Article
Text
id pubmed-9710635
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9710635 2023-01-24 More practical differentially private publication of key statistics in GWAS Yamamoto, Akito Shibuya, Tetsuo Bioinform Adv Original Article Oxford University Press 2021-05-18 /pmc/articles/PMC9710635/ /pubmed/36700105 http://dx.doi.org/10.1093/bioadv/vbab004 Text en © The Author(s) 2021. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title More practical differentially private publication of key statistics in GWAS
topic Original Article