Privacy-first health research with federated learning
Privacy protection is paramount in conducting health research. However, studies often rely on data stored in a centralized repository, where analysis is done with full access to the sensitive underlying content. Recent advances in federated learning enable building complex machine-learned models tha...
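Main authors: | Sadilek, Adam; Liu, Luyang; Nguyen, Dung; Kamruzzaman, Methun; Serghiou, Stylianos; Rader, Benjamin; Ingerman, Alex; Mellem, Stefan; Kairouz, Peter; Nsoesie, Elaine O.; MacFarlane, Jamie; Vullikanti, Anil; Marathe, Madhav; Eastham, Paul; Brownstein, John S.; Arcas, Blaise Aguera y.; Howell, Michael D.; Hernandez, John |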
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Nature Publishing Group UK, 2021 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8423792/ https://www.ncbi.nlm.nih.gov/pubmed/34493770 http://dx.doi.org/10.1038/s41746-021-00489-2 |
_version_ | 1783749542337839104 |
---|---|
author | Sadilek, Adam Liu, Luyang Nguyen, Dung Kamruzzaman, Methun Serghiou, Stylianos Rader, Benjamin Ingerman, Alex Mellem, Stefan Kairouz, Peter Nsoesie, Elaine O. MacFarlane, Jamie Vullikanti, Anil Marathe, Madhav Eastham, Paul Brownstein, John S. Arcas, Blaise Aguera y. Howell, Michael D. Hernandez, John |
author_facet | Sadilek, Adam Liu, Luyang Nguyen, Dung Kamruzzaman, Methun Serghiou, Stylianos Rader, Benjamin Ingerman, Alex Mellem, Stefan Kairouz, Peter Nsoesie, Elaine O. MacFarlane, Jamie Vullikanti, Anil Marathe, Madhav Eastham, Paul Brownstein, John S. Arcas, Blaise Aguera y. Howell, Michael D. Hernandez, John |
author_sort | Sadilek, Adam |
collection | PubMed |
description | Privacy protection is paramount in conducting health research. However, studies often rely on data stored in a centralized repository, where analysis is done with full access to the sensitive underlying content. Recent advances in federated learning enable building complex machine-learned models that are trained in a distributed fashion. These techniques facilitate the calculation of research study endpoints such that private data never leaves a given device or healthcare system. We show—on a diverse set of single and multi-site health studies—that federated models can achieve similar accuracy, precision, and generalizability, and lead to the same interpretation as standard centralized statistical models while achieving considerably stronger privacy protections and without significantly raising computational costs. This work is the first to apply modern and general federated learning methods that explicitly incorporate differential privacy to clinical and epidemiological research—across a spectrum of units of federation, model architectures, complexity of learning tasks and diseases. As a result, it enables health research participants to remain in control of their data and still contribute to advancing science—aspects that used to be at odds with each other. |
format | Online Article Text |
id | pubmed-8423792 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2021 |
publisher | Nature Publishing Group UK |
record_format | MEDLINE/PubMed |
spelling | pubmed-84237922021-09-14 Privacy-first health research with federated learning Sadilek, Adam Liu, Luyang Nguyen, Dung Kamruzzaman, Methun Serghiou, Stylianos Rader, Benjamin Ingerman, Alex Mellem, Stefan Kairouz, Peter Nsoesie, Elaine O. MacFarlane, Jamie Vullikanti, Anil Marathe, Madhav Eastham, Paul Brownstein, John S. Arcas, Blaise Aguera y. Howell, Michael D. Hernandez, John NPJ Digit Med Article Privacy protection is paramount in conducting health research. However, studies often rely on data stored in a centralized repository, where analysis is done with full access to the sensitive underlying content. Recent advances in federated learning enable building complex machine-learned models that are trained in a distributed fashion. These techniques facilitate the calculation of research study endpoints such that private data never leaves a given device or healthcare system. We show—on a diverse set of single and multi-site health studies—that federated models can achieve similar accuracy, precision, and generalizability, and lead to the same interpretation as standard centralized statistical models while achieving considerably stronger privacy protections and without significantly raising computational costs. This work is the first to apply modern and general federated learning methods that explicitly incorporate differential privacy to clinical and epidemiological research—across a spectrum of units of federation, model architectures, complexity of learning tasks and diseases. As a result, it enables health research participants to remain in control of their data and still contribute to advancing science—aspects that used to be at odds with each other. 
Nature Publishing Group UK 2021-09-07 /pmc/articles/PMC8423792/ /pubmed/34493770 http://dx.doi.org/10.1038/s41746-021-00489-2 Text en © The Author(s) 2021 https://creativecommons.org/licenses/by/4.0/ Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/. |
spellingShingle | Article Sadilek, Adam Liu, Luyang Nguyen, Dung Kamruzzaman, Methun Serghiou, Stylianos Rader, Benjamin Ingerman, Alex Mellem, Stefan Kairouz, Peter Nsoesie, Elaine O. MacFarlane, Jamie Vullikanti, Anil Marathe, Madhav Eastham, Paul Brownstein, John S. Arcas, Blaise Aguera y. Howell, Michael D. Hernandez, John Privacy-first health research with federated learning |
title | Privacy-first health research with federated learning |
title_full | Privacy-first health research with federated learning |
title_fullStr | Privacy-first health research with federated learning |
title_full_unstemmed | Privacy-first health research with federated learning |
title_short | Privacy-first health research with federated learning |
title_sort | privacy-first health research with federated learning |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8423792/ https://www.ncbi.nlm.nih.gov/pubmed/34493770 http://dx.doi.org/10.1038/s41746-021-00489-2 |
work_keys_str_mv | AT sadilekadam privacyfirsthealthresearchwithfederatedlearning AT liuluyang privacyfirsthealthresearchwithfederatedlearning AT nguyendung privacyfirsthealthresearchwithfederatedlearning AT kamruzzamanmethun privacyfirsthealthresearchwithfederatedlearning AT serghioustylianos privacyfirsthealthresearchwithfederatedlearning AT raderbenjamin privacyfirsthealthresearchwithfederatedlearning AT ingermanalex privacyfirsthealthresearchwithfederatedlearning AT mellemstefan privacyfirsthealthresearchwithfederatedlearning AT kairouzpeter privacyfirsthealthresearchwithfederatedlearning AT nsoesieelaineo privacyfirsthealthresearchwithfederatedlearning AT macfarlanejamie privacyfirsthealthresearchwithfederatedlearning AT vullikantianil privacyfirsthealthresearchwithfederatedlearning AT marathemadhav privacyfirsthealthresearchwithfederatedlearning AT easthampaul privacyfirsthealthresearchwithfederatedlearning AT brownsteinjohns privacyfirsthealthresearchwithfederatedlearning AT arcasblaiseagueray privacyfirsthealthresearchwithfederatedlearning AT howellmichaeld privacyfirsthealthresearchwithfederatedlearning AT hernandezjohn privacyfirsthealthresearchwithfederatedlearning |