Coupled mixed model for joint genetic analysis of complex disorders with two independently collected data sets

BACKGROUND: In the last decade, Genome-wide Association studies (GWASs) have contributed to decoding the human genome by uncovering many genetic variations associated with various diseases. Many follow-up investigations involve joint analysis of multiple independently generated GWAS data sets. While...

Bibliographic Details
Main Authors: Wang, Haohan, Pei, Fen, Vanyukov, Michael M., Bahar, Ivet, Wu, Wei, Xing, Eric P.
Format: Online Article Text
Language: English
Published: BioMed Central 2021
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7866684/
https://www.ncbi.nlm.nih.gov/pubmed/33546598
http://dx.doi.org/10.1186/s12859-021-03959-2
author Wang, Haohan
Pei, Fen
Vanyukov, Michael M.
Bahar, Ivet
Wu, Wei
Xing, Eric P.
author_sort Wang, Haohan
collection PubMed
description BACKGROUND: In the last decade, genome-wide association studies (GWASs) have contributed to decoding the human genome by uncovering many genetic variations associated with various diseases. Many follow-up investigations involve joint analysis of multiple independently generated GWAS data sets. While most computational approaches developed for joint analysis are based on summary statistics, joint analysis based on individual-level data that accounts for confounding factors remains a challenge. RESULTS: In this study, we propose a method, called Coupled Mixed Model (CMM), that enables a joint GWAS analysis of two independently collected GWAS data sets with different phenotypes. The CMM method does not require the data sets to have the same phenotypes, as it aims to infer the unknown phenotypes using a set of multivariate sparse mixed models. Moreover, CMM addresses confounding variables due to population stratification, family structures, and cryptic relatedness, as well as those arising during data collection, such as batch effects, that frequently appear in joint genetic studies. We evaluate the performance of CMM using simulation experiments. In real data analysis, we illustrate the utility of CMM by applying it to evaluate common genetic associations for Alzheimer’s disease and substance use disorder, using data sets independently collected for the two complex human disorders. Comparison of the results with those from previous experiments and analyses supports the utility of our method and provides new insights into the diseases. The software is available at https://github.com/HaohanWang/CMM.
format Online
Article
Text
id pubmed-7866684
institution National Center for Biotechnology Information
language English
publishDate 2021
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-7866684 2021-02-08 Coupled mixed model for joint genetic analysis of complex disorders with two independently collected data sets Wang, Haohan Pei, Fen Vanyukov, Michael M. Bahar, Ivet Wu, Wei Xing, Eric P. BMC Bioinformatics Methodology Article
BioMed Central 2021-02-05 /pmc/articles/PMC7866684/ /pubmed/33546598 http://dx.doi.org/10.1186/s12859-021-03959-2 Text en © The Author(s) 2021. Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
title Coupled mixed model for joint genetic analysis of complex disorders with two independently collected data sets
topic Methodology Article