
A privacy-preserving and computation-efficient federated algorithm for generalized linear mixed models to analyze correlated electronic health records data

Large collaborative research networks provide opportunities to jointly analyze multicenter electronic health record (EHR) data, which can improve the sample size, diversity of the study population, and generalizability of the results. However, there are challenges to analyzing multicenter EHR data i...


Bibliographic Details
Main authors: Yan, Zhiyu; Zachrison, Kori S.; Schwamm, Lee H.; Estrada, Juan J.; Duan, Rui
Format: Online Article Text
Language: English
Published: Public Library of Science 2023
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9844867/
https://www.ncbi.nlm.nih.gov/pubmed/36649349
http://dx.doi.org/10.1371/journal.pone.0280192
author Yan, Zhiyu
Zachrison, Kori S.
Schwamm, Lee H.
Estrada, Juan J.
Duan, Rui
collection PubMed
description Large collaborative research networks provide opportunities to jointly analyze multicenter electronic health record (EHR) data, which can improve the sample size, diversity of the study population, and generalizability of the results. However, analyzing multicenter EHR data poses challenges, including privacy protection, large-scale computational resource requirements, heterogeneity across sites, and correlated observations. In this paper, we propose a federated algorithm for generalized linear mixed models (Fed-GLMM), which can flexibly model multicenter longitudinal or correlated data while accounting for site-level heterogeneity. Fed-GLMM can be applied to both federated and centralized research networks to enable privacy-preserving data integration and improve computational efficiency. By communicating only a limited amount of summary statistics, Fed-GLMM can achieve results nearly identical to those of the gold-standard method, in which the GLMM is fitted directly to the pooled dataset. We demonstrate the performance of Fed-GLMM in numerical experiments and in an application to longitudinal EHR data from multiple healthcare facilities.
format Online
Article
Text
id pubmed-9844867
institution National Center for Biotechnology Information
language English
publishDate 2023
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-9844867 2023-01-18 PLoS One Research Article Public Library of Science 2023-01-17 /pmc/articles/PMC9844867/ /pubmed/36649349 http://dx.doi.org/10.1371/journal.pone.0280192 Text en © 2023 Yan et al. https://creativecommons.org/licenses/by/4.0/ This is an open access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
title A privacy-preserving and computation-efficient federated algorithm for generalized linear mixed models to analyze correlated electronic health records data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9844867/
https://www.ncbi.nlm.nih.gov/pubmed/36649349
http://dx.doi.org/10.1371/journal.pone.0280192