Privacy-preserving construction of generalized linear mixed model for biomedical computation
MOTIVATION: The generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor takes random effects into account. Given its power of precisely modeling the mixed effects from multiple sources of random variations, the method has been widely...
Main authors: | Zhu, Rui; Jiang, Chao; Wang, Xiaofeng; Wang, Shuang; Zheng, Hao; Tang, Haixu |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Oxford University Press 2020 |
Subjects: | Genome Privacy and Security |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355231/ https://www.ncbi.nlm.nih.gov/pubmed/32657380 http://dx.doi.org/10.1093/bioinformatics/btaa478 |
_version_ | 1783558232702189568 |
---|---|
author | Zhu, Rui Jiang, Chao Wang, Xiaofeng Wang, Shuang Zheng, Hao Tang, Haixu |
author_facet | Zhu, Rui Jiang, Chao Wang, Xiaofeng Wang, Shuang Zheng, Hao Tang, Haixu |
author_sort | Zhu, Rui |
collection | PubMed |
description | MOTIVATION: The generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor takes random effects into account. Given its power of precisely modeling the mixed effects from multiple sources of random variations, the method has been widely used in biomedical computation, for instance in the genome-wide association studies (GWASs) that aim to detect genetic variance significantly associated with phenotypes such as human diseases. Collaborative GWAS on large cohorts of patients across multiple institutions is often impeded by the privacy concerns of sharing personal genomic and other health data. To address such concerns, we present in this paper a privacy-preserving Expectation–Maximization (EM) algorithm to build GLMM collaboratively when input data are distributed to multiple participating parties and cannot be transferred to a central server. We assume that the data are horizontally partitioned among participating parties: i.e. each party holds a subset of records (including observational values of fixed effect variables and their corresponding outcome), and for all records, the outcome is regulated by the same set of known fixed effects and random effects. RESULTS: Our collaborative EM algorithm is mathematically equivalent to the original EM algorithm commonly used in GLMM construction. The algorithm also runs efficiently when tested on simulated and real human genomic data, and thus can be practically used for privacy-preserving GLMM construction. We implemented the algorithm for collaborative GLMM (cGLMM) construction in R. The data communication was implemented using the rsocket package. AVAILABILITY AND IMPLEMENTATION: The software is released in open source at https://github.com/huthvincent/cGLMM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. |
format | Online Article Text |
id | pubmed-7355231 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2020 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-7355231 2020-07-16 Privacy-preserving construction of generalized linear mixed model for biomedical computation Zhu, Rui Jiang, Chao Wang, Xiaofeng Wang, Shuang Zheng, Hao Tang, Haixu Bioinformatics Genome Privacy and Security MOTIVATION: The generalized linear mixed model (GLMM) is an extension of the generalized linear model (GLM) in which the linear predictor takes random effects into account. Given its power of precisely modeling the mixed effects from multiple sources of random variations, the method has been widely used in biomedical computation, for instance in the genome-wide association studies (GWASs) that aim to detect genetic variance significantly associated with phenotypes such as human diseases. Collaborative GWAS on large cohorts of patients across multiple institutions is often impeded by the privacy concerns of sharing personal genomic and other health data. To address such concerns, we present in this paper a privacy-preserving Expectation–Maximization (EM) algorithm to build GLMM collaboratively when input data are distributed to multiple participating parties and cannot be transferred to a central server. We assume that the data are horizontally partitioned among participating parties: i.e. each party holds a subset of records (including observational values of fixed effect variables and their corresponding outcome), and for all records, the outcome is regulated by the same set of known fixed effects and random effects. RESULTS: Our collaborative EM algorithm is mathematically equivalent to the original EM algorithm commonly used in GLMM construction. The algorithm also runs efficiently when tested on simulated and real human genomic data, and thus can be practically used for privacy-preserving GLMM construction. We implemented the algorithm for collaborative GLMM (cGLMM) construction in R. The data communication was implemented using the rsocket package. AVAILABILITY AND IMPLEMENTATION: The software is released in open source at https://github.com/huthvincent/cGLMM. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online. Oxford University Press 2020-07 2020-07-13 /pmc/articles/PMC7355231/ /pubmed/32657380 http://dx.doi.org/10.1093/bioinformatics/btaa478 Text en © The Author(s) 2020. Published by Oxford University Press. http://creativecommons.org/licenses/by-nc/4.0/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com |
spellingShingle | Genome Privacy and Security Zhu, Rui Jiang, Chao Wang, Xiaofeng Wang, Shuang Zheng, Hao Tang, Haixu Privacy-preserving construction of generalized linear mixed model for biomedical computation |
title | Privacy-preserving construction of generalized linear mixed model for biomedical computation |
title_full | Privacy-preserving construction of generalized linear mixed model for biomedical computation |
title_fullStr | Privacy-preserving construction of generalized linear mixed model for biomedical computation |
title_full_unstemmed | Privacy-preserving construction of generalized linear mixed model for biomedical computation |
title_short | Privacy-preserving construction of generalized linear mixed model for biomedical computation |
title_sort | privacy-preserving construction of generalized linear mixed model for biomedical computation |
topic | Genome Privacy and Security |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7355231/ https://www.ncbi.nlm.nih.gov/pubmed/32657380 http://dx.doi.org/10.1093/bioinformatics/btaa478 |
work_keys_str_mv | AT zhurui privacypreservingconstructionofgeneralizedlinearmixedmodelforbiomedicalcomputation AT jiangchao privacypreservingconstructionofgeneralizedlinearmixedmodelforbiomedicalcomputation AT wangxiaofeng privacypreservingconstructionofgeneralizedlinearmixedmodelforbiomedicalcomputation AT wangshuang privacypreservingconstructionofgeneralizedlinearmixedmodelforbiomedicalcomputation AT zhenghao privacypreservingconstructionofgeneralizedlinearmixedmodelforbiomedicalcomputation AT tanghaixu privacypreservingconstructionofgeneralizedlinearmixedmodelforbiomedicalcomputation |
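
The abstract states that the data are horizontally partitioned and that the collaborative algorithm is mathematically equivalent to the centralized one. One reason such an equivalence is plausible is that likelihood-based quantities are sums over records, so per-party contributions can be added together. The Python sketch below illustrates only that additive property. It is not the authors' cGLMM implementation (which is written in R), it fits no random effects, and it applies no privacy protection to the exchanged sums. The names `local_score` and `aggregate` and the toy records are hypothetical.

```python
import math


def local_score(records, beta):
    """Gradient of the logistic log-likelihood over one party's records.

    Each record is (x, y): x is a list of fixed-effect covariates, y is 0 or 1.
    """
    grad = [0.0] * len(beta)
    for x, y in records:
        eta = sum(b * xi for b, xi in zip(beta, x))  # linear predictor
        mu = 1.0 / (1.0 + math.exp(-eta))            # inverse logit link
        for j, xi in enumerate(x):
            grad[j] += (y - mu) * xi
    return grad


def aggregate(local_grads):
    """Element-wise sum of the per-party gradients."""
    return [sum(parts) for parts in zip(*local_grads)]


# Hypothetical toy data: three parties, each holding its own rows.
party_a = [([1.0, 0.2], 1), ([1.0, -1.3], 0)]
party_b = [([1.0, 0.7], 1)]
party_c = [([1.0, 1.5], 1), ([1.0, -0.4], 0), ([1.0, 0.1], 0)]
beta = [0.1, -0.3]

# Centralized computation on the pooled rows versus the sum of local results.
pooled = local_score(party_a + party_b + party_c, beta)
combined = aggregate([local_score(p, beta) for p in (party_a, party_b, party_c)])
```

Here `pooled` and `combined` agree up to floating-point rounding, which is the sense in which a sum-based collaborative step can reproduce the centralized one under these simplifying assumptions.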