Web-Based Privacy-Preserving Multicenter Medical Data Analysis Tools Via Threshold Homomorphic Encryption: Design and Development Study

BACKGROUND: Data sharing in multicenter medical research can improve the generalizability of research, accelerate progress, enhance collaborations among institutions, and lead to new discoveries from data pooled from multiple sources. Despite these benefits, many medical institutions are unwilling t...

Bibliographic Details
Main Authors:	Lu, Yao; Zhou, Tianshu; Tian, Yu; Zhu, Shiqiang; Li, Jingsong
Format:	Online Article Text
Language:	English
Published:	JMIR Publications 2020
Subjects:	Original Paper
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7755539/ https://www.ncbi.nlm.nih.gov/pubmed/33289676 http://dx.doi.org/10.2196/22555

author	Lu, Yao; Zhou, Tianshu; Tian, Yu; Zhu, Shiqiang; Li, Jingsong
collection	PubMed
description	BACKGROUND: Data sharing in multicenter medical research can improve the generalizability of research, accelerate progress, enhance collaborations among institutions, and lead to new discoveries from data pooled from multiple sources. Despite these benefits, many medical institutions are unwilling to share their data, as sharing may cause sensitive information to be leaked to researchers, other institutions, and unauthorized users. Great progress has been made in the development of secure machine learning frameworks based on homomorphic encryption in recent years; however, nearly all such frameworks use a single secret key and lack a description of how to securely evaluate the trained model, which makes them impractical for multicenter medical applications. OBJECTIVE: The aim of this study is to provide a privacy-preserving machine learning protocol (eg, logistic regression) for multiple data providers and researchers. This protocol allows researchers to train models and then evaluate them on medical data from multiple sources while providing privacy protection for both the sensitive data and the learned model. METHODS: We adapted a novel threshold homomorphic encryption scheme to guarantee privacy requirements. We devised new relinearization key generation techniques for greater scalability and multiplicative depth and new model training strategies for simultaneously training multiple models through x-fold cross-validation. RESULTS: Using a client-server architecture, we evaluated the performance of our protocol. The experimental results demonstrated that, with 10-fold cross-validation, our privacy-preserving logistic regression model training and evaluation over 10 attributes in a data set of 49,152 samples took approximately 7 minutes and 20 minutes, respectively. CONCLUSIONS: We present the first privacy-preserving multiparty logistic regression model training and evaluation protocol based on threshold homomorphic encryption. Our protocol is practical for real-world use and may promote multicenter medical research to some extent.
format	Online Article Text
id	pubmed-7755539
institution	National Center for Biotechnology Information
language	English
publishDate	2020
publisher	JMIR Publications
record_format	MEDLINE/PubMed
spelling	pubmed-7755539 2020-12-31 J Med Internet Res Original Paper JMIR Publications 2020-12-08 /pmc/articles/PMC7755539/ /pubmed/33289676 http://dx.doi.org/10.2196/22555 Text en ©Yao Lu, Tianshu Zhou, Yu Tian, Shiqiang Zhu, Jingsong Li. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 08.12.2020. This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.
title	Web-Based Privacy-Preserving Multicenter Medical Data Analysis Tools Via Threshold Homomorphic Encryption: Design and Development Study
topic	Original Paper
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7755539/ https://www.ncbi.nlm.nih.gov/pubmed/33289676 http://dx.doi.org/10.2196/22555