
Robust fingerprinting of genomic databases

MOTIVATION: Database fingerprinting has been widely used to discourage unauthorized redistribution of data by providing means to identify the source of data leakages. However, there is no fingerprinting scheme aiming at achieving liability guarantees when sharing genomic databases. Thus, we are moti...

Bibliographic Details
Main authors: Ji, Tianxi; Ayday, Erman; Yilmaz, Emre; Li, Pan
Format: Online Article Text
Language: English
Published: Oxford University Press 2022
Subjects: ISCB/Ismb 2022
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9236581/
https://www.ncbi.nlm.nih.gov/pubmed/35758787
http://dx.doi.org/10.1093/bioinformatics/btac243
_version_ 1784736565669396480
author Ji, Tianxi
Ayday, Erman
Yilmaz, Emre
Li, Pan
collection PubMed
description MOTIVATION: Database fingerprinting has been widely used to discourage unauthorized redistribution of data by providing means to identify the source of data leakages. However, there is no fingerprinting scheme aiming at achieving liability guarantees when sharing genomic databases. Thus, we are motivated to fill in this gap by devising a vanilla fingerprinting scheme specifically for genomic databases. Moreover, since malicious genomic database recipients may compromise the embedded fingerprint (distort the steganographic marks, i.e. the embedded fingerprint bit-string) by launching effective correlation attacks, which leverage the intrinsic correlations among genomic data (e.g. Mendel’s law and linkage disequilibrium), we also augment the vanilla scheme by developing mitigation techniques to achieve robust fingerprinting of genomic databases against correlation attacks. RESULTS: Via experiments using a real-world genomic database, we first show that correlation attacks against fingerprinting schemes for genomic databases are very powerful. In particular, the correlation attacks can distort more than half of the fingerprint bits by causing a small utility loss (e.g. database accuracy and consistency of SNP–phenotype associations measured via P-values). Next, we experimentally show that the correlation attacks can be effectively mitigated by our proposed mitigation techniques. We validate that the attacker can hardly compromise a large portion of the fingerprint bits even if it pays a higher cost in terms of degradation of the database utility. For example, with around 24% loss in accuracy and 20% loss in the consistency of SNP–phenotype associations, the attacker can only distort about 30% fingerprint bits, which is insufficient for it to avoid being accused. We also show that the proposed mitigation techniques also preserve the utility of the shared genomic databases, e.g. the mitigation techniques only lead to around 3% loss in accuracy. 
AVAILABILITY AND IMPLEMENTATION: https://github.com/xiutianxi/robust-genomic-fp-github.
format Online
Article
Text
id pubmed-9236581
institution National Center for Biotechnology Information
language English
publishDate 2022
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-9236581 2022-06-29 Robust fingerprinting of genomic databases Ji, Tianxi Ayday, Erman Yilmaz, Emre Li, Pan Bioinformatics ISCB/Ismb 2022 Oxford University Press 2022-06-27 /pmc/articles/PMC9236581/ /pubmed/35758787 http://dx.doi.org/10.1093/bioinformatics/btac243 Text en © The Author(s) 2022. Published by Oxford University Press. https://creativecommons.org/licenses/by/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.
title Robust fingerprinting of genomic databases
topic ISCB/Ismb 2022
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9236581/
https://www.ncbi.nlm.nih.gov/pubmed/35758787
http://dx.doi.org/10.1093/bioinformatics/btac243