Privacy-preserving construction of generalized linear mixed model for biomedical computation

MOTIVATION: The generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor takes random effects into account. Given its power of precisely modeling the mixed effects from multiple sources of random variations, the method has been widely...

Bibliographic Details
Main authors: Zhu, Rui; Jiang, Chao; Wang, Xiaofeng; Wang, Shuang; Zheng, Hao; Tang, Haixu
Format: Online Article Text
Language: English
Published: Oxford University Press 2020
Subjects: Genome Privacy and Security
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355231/
https://www.ncbi.nlm.nih.gov/pubmed/32657380
http://dx.doi.org/10.1093/bioinformatics/btaa478
_version_ 1783558232702189568
author Zhu, Rui
Jiang, Chao
Wang, Xiaofeng
Wang, Shuang
Zheng, Hao
Tang, Haixu
collection PubMed
description MOTIVATION: The generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor takes random effects into account. Given its power of precisely modeling the mixed effects from multiple sources of random variations, the method has been widely used in biomedical computation, for instance in the genome-wide association studies (GWASs) that aim to detect genetic variance significantly associated with phenotypes such as human diseases. Collaborative GWAS on large cohorts of patients across multiple institutions is often impeded by the privacy concerns of sharing personal genomic and other health data. To address such concerns, we present in this paper a privacy-preserving Expectation–Maximization (EM) algorithm to build GLMM collaboratively when input data are distributed to multiple participating parties and cannot be transferred to a central server. We assume that the data are horizontally partitioned among participating parties: i.e. each party holds a subset of records (including observational values of fixed effect variables and their corresponding outcome), and for all records, the outcome is regulated by the same set of known fixed effects and random effects. RESULTS: Our collaborative EM algorithm is mathematically equivalent to the original EM algorithm commonly used in GLMM construction. The algorithm also runs efficiently when tested on simulated and real human genomic data, and thus can be practically used for privacy-preserving GLMM construction. We implemented the algorithm for collaborative GLMM (cGLMM) construction in R. The data communication was implemented using the rsocket package. AVAILABILITY AND IMPLEMENTATION: The software is released in open source at https://github.com/huthvincent/cGLMM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
format Online
Article
Text
id pubmed-7355231
institution National Center for Biotechnology Information
language English
publishDate 2020
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-7355231 2020-07-16 Bioinformatics Genome Privacy and Security Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355231/ /pubmed/32657380 http://dx.doi.org/10.1093/bioinformatics/btaa478 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com
title Privacy-preserving construction of generalized linear mixed model for biomedical computation
topic Genome Privacy and Security
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355231/
https://www.ncbi.nlm.nih.gov/pubmed/32657380
http://dx.doi.org/10.1093/bioinformatics/btaa478
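
The abstract describes horizontally partitioned data and states that the collaborative algorithm is mathematically equivalent to the centralized one. The Python sketch below illustrates only the general idea that makes such equivalence possible: when each party holds a subset of records, per-record sums such as a gradient and a Hessian can be computed locally and then added. It is not the authors' cGLMM implementation (which is in R), it fits a plain logistic regression with no random effects and no EM step, and it applies no privacy protection to the exchanged sums. All names (`local_stats`, `newton_step`, `fit`, `party_a`, `party_b`) and the toy data are hypothetical.

```python
import math


def local_stats(records, beta):
    """Gradient and Hessian contributions from one party's records.

    Model: logistic regression with an intercept and one covariate.
    Each record is a pair (x, y) with y in {0, 1}.
    """
    grad = [0.0, 0.0]
    hess = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in records:
        row = (1.0, x)
        p = 1.0 / (1.0 + math.exp(-(beta[0] + beta[1] * x)))
        w = p * (1.0 - p)
        for i in range(2):
            grad[i] += (y - p) * row[i]
            for j in range(2):
                hess[i][j] += w * row[i] * row[j]
    return grad, hess


def newton_step(beta, grad, hess):
    """Solve the 2x2 system hess * delta = grad and update beta."""
    det = hess[0][0] * hess[1][1] - hess[0][1] * hess[1][0]
    d0 = (hess[1][1] * grad[0] - hess[0][1] * grad[1]) / det
    d1 = (hess[0][0] * grad[1] - hess[1][0] * grad[0]) / det
    return [beta[0] + d0, beta[1] + d1]


def fit(parties, iterations=25):
    """Newton iterations in which only summed statistics are combined.

    `parties` is a list of record lists; passing a single list containing
    all records corresponds to the centralized fit.
    """
    beta = [0.0, 0.0]
    for _ in range(iterations):
        grad = [0.0, 0.0]
        hess = [[0.0, 0.0], [0.0, 0.0]]
        for records in parties:
            g, h = local_stats(records, beta)
            for i in range(2):
                grad[i] += g[i]
                for j in range(2):
                    hess[i][j] += h[i][j]
        beta = newton_step(beta, grad, hess)
    return beta


# Toy, non-separable data split across two hypothetical parties.
party_a = [(0.5, 0), (1.5, 1), (2.0, 0), (3.0, 1)]
party_b = [(1.0, 0), (2.5, 1), (3.5, 1), (0.0, 1)]

pooled = fit([party_a + party_b])   # all records in one place
split = fit([party_a, party_b])     # each party contributes only its sums
print(pooled, split)
```

The two fits agree up to floating-point rounding because the gradient and Hessian are sums over records. A privacy-preserving protocol would additionally need to protect the exchanged sums, which this sketch does not attempt.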