Privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption

Bibliographic Details
Main Authors: Carpov, Sergiu, Gama, Nicolas, Georgieva, Mariya, Troncoso-Pastoriza, Juan Ramon
Format: Online Article Text
Language: English
Published: BioMed Central 2020
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7372765/
https://www.ncbi.nlm.nih.gov/pubmed/32693814
http://dx.doi.org/10.1186/s12920-020-0723-0
_version_ 1783561376791265280
author Carpov, Sergiu
Gama, Nicolas
Georgieva, Mariya
Troncoso-Pastoriza, Juan Ramon
author_facet Carpov, Sergiu
Gama, Nicolas
Georgieva, Mariya
Troncoso-Pastoriza, Juan Ramon
author_sort Carpov, Sergiu
collection PubMed
description BACKGROUND: Privacy-preserving computation on genomic data, and more generally on medical data, is a critical-path technology for innovative, life-saving research to positively and equally impact the global population. It enables medical research algorithms to be securely deployed in the cloud because operations on encrypted genomic databases are conducted without revealing any individual genomes. Methods for secure computation have shown significant performance improvements over the last several years. However, it is still challenging to apply them to large biomedical datasets. METHODS: The HE Track of the iDash 2018 competition focused on solving an important problem in practical machine learning scenarios, where a data analyst who has trained a regression model (both linear and logistic) with a certain set of features attempts to find all features in an encrypted database that will improve the quality of the model. Our solution is based on the hybrid framework Chimera, which allows for switching between different families of fully homomorphic schemes, namely TFHE and HEAAN. RESULTS: Our solution is one of the finalists of Track 2 of the iDash 2018 competition. Among the submitted solutions, ours is the only bootstrapped approach that can be applied to different sets of parameters without re-encrypting the genomic database, making it practical for real-world applications. CONCLUSIONS: This is the first step towards the more general feature selection problem across large encrypted databases.
format Online
Article
Text
id pubmed-7372765
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7372765 2020-07-21 Privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption Carpov, Sergiu Gama, Nicolas Georgieva, Mariya Troncoso-Pastoriza, Juan Ramon BMC Med Genomics Research BACKGROUND: Privacy-preserving computation on genomic data, and more generally on medical data, is a critical-path technology for innovative, life-saving research to positively and equally impact the global population. It enables medical research algorithms to be securely deployed in the cloud because operations on encrypted genomic databases are conducted without revealing any individual genomes. Methods for secure computation have shown significant performance improvements over the last several years. However, it is still challenging to apply them to large biomedical datasets. METHODS: The HE Track of the iDash 2018 competition focused on solving an important problem in practical machine learning scenarios, where a data analyst who has trained a regression model (both linear and logistic) with a certain set of features attempts to find all features in an encrypted database that will improve the quality of the model. Our solution is based on the hybrid framework Chimera, which allows for switching between different families of fully homomorphic schemes, namely TFHE and HEAAN. RESULTS: Our solution is one of the finalists of Track 2 of the iDash 2018 competition. Among the submitted solutions, ours is the only bootstrapped approach that can be applied to different sets of parameters without re-encrypting the genomic database, making it practical for real-world applications. CONCLUSIONS: This is the first step towards the more general feature selection problem across large encrypted databases.
BioMed Central 2020-07-21 /pmc/articles/PMC7372765/ /pubmed/32693814 http://dx.doi.org/10.1186/s12920-020-0723-0 Text en © The Author(s) 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
spellingShingle Research
Carpov, Sergiu
Gama, Nicolas
Georgieva, Mariya
Troncoso-Pastoriza, Juan Ramon
Privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption
title Privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption
title_full Privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption
title_fullStr Privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption
title_full_unstemmed Privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption
title_short Privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption
title_sort privacy-preserving semi-parallel logistic regression training with fully homomorphic encryption
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7372765/
https://www.ncbi.nlm.nih.gov/pubmed/32693814
http://dx.doi.org/10.1186/s12920-020-0723-0
work_keys_str_mv AT carpovsergiu privacypreservingsemiparallellogisticregressiontrainingwithfullyhomomorphicencryption
AT gamanicolas privacypreservingsemiparallellogisticregressiontrainingwithfullyhomomorphicencryption
AT georgievamariya privacypreservingsemiparallellogisticregressiontrainingwithfullyhomomorphicencryption
AT troncosopastorizajuanramon privacypreservingsemiparallellogisticregressiontrainingwithfullyhomomorphicencryption