
CollapsABEL: an R library for detecting compound heterozygote alleles in genome-wide association studies

BACKGROUND: Compound Heterozygosity (CH) in classical genetics is the presence of two different recessive mutations at a particular gene locus. A relaxed form of CH alleles may account for an essential proportion of the missing heritability, i.e. heritability of phenotypes so far not accounted for b...


Bibliographic Details
Main authors: Zhong, Kaiyin, Karssen, Lennart C., Kayser, Manfred, Liu, Fan
Format: Online Article Text
Language: English
Published: BioMed Central 2016
Subjects: Software
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4826552/
https://www.ncbi.nlm.nih.gov/pubmed/27059780
http://dx.doi.org/10.1186/s12859-016-1006-9
_version_ 1782426353691262976
author Zhong, Kaiyin
Karssen, Lennart C.
Kayser, Manfred
Liu, Fan
author_facet Zhong, Kaiyin
Karssen, Lennart C.
Kayser, Manfred
Liu, Fan
author_sort Zhong, Kaiyin
collection PubMed
description BACKGROUND: Compound Heterozygosity (CH) in classical genetics is the presence of two different recessive mutations at a particular gene locus. A relaxed form of CH alleles may account for an essential proportion of the missing heritability, i.e. heritability of phenotypes so far not accounted for by single genetic variants. Methods to detect CH-like effects in genome-wide association studies (GWAS) may facilitate explaining the missing heritability, but to our knowledge no viable software tools for this purpose are currently available. RESULTS: In this work we present the Generalized Compound Double Heterozygosity (GCDH) test and its implementation in the R package CollapsABEL. Time-consuming procedures are optimized for computational efficiency using Java or C++. Intermediate results are stored either in an SQL database or in a so-called big.matrix file to achieve reasonable memory footprint. Our large scale simulation studies show that GCDH is capable of discovering genetic associations due to CH-like interactions with much higher power than a conventional single-SNP approach under various settings, whether the causal genetic variations are available or not. CollapsABEL provides a user-friendly pipeline for genotype collapsing, statistical testing, power estimation, type I error control and graphics generation in the R language. CONCLUSIONS: CollapsABEL provides a computationally efficient solution for screening general forms of CH alleles in densely imputed microarray or whole genome sequencing datasets. The GCDH test provides an improved power over single-SNP based methods in detecting the prevalence of CH in human complex phenotypes, offering an opportunity for tackling the missing heritability problem. Binary and source packages of CollapsABEL are available on CRAN (https://cran.r-project.org/web/packages/CollapsABEL) and the website of the GenABEL project (http://www.genabel.org/packages). 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1006-9) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4826552
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-48265522016-04-10 CollapsABEL: an R library for detecting compound heterozygote alleles in genome-wide association studies Zhong, Kaiyin Karssen, Lennart C. Kayser, Manfred Liu, Fan BMC Bioinformatics Software BACKGROUND: Compound Heterozygosity (CH) in classical genetics is the presence of two different recessive mutations at a particular gene locus. A relaxed form of CH alleles may account for an essential proportion of the missing heritability, i.e. heritability of phenotypes so far not accounted for by single genetic variants. Methods to detect CH-like effects in genome-wide association studies (GWAS) may facilitate explaining the missing heritability, but to our knowledge no viable software tools for this purpose are currently available. RESULTS: In this work we present the Generalized Compound Double Heterozygosity (GCDH) test and its implementation in the R package CollapsABEL. Time-consuming procedures are optimized for computational efficiency using Java or C++. Intermediate results are stored either in an SQL database or in a so-called big.matrix file to achieve reasonable memory footprint. Our large scale simulation studies show that GCDH is capable of discovering genetic associations due to CH-like interactions with much higher power than a conventional single-SNP approach under various settings, whether the causal genetic variations are available or not. CollapsABEL provides a user-friendly pipeline for genotype collapsing, statistical testing, power estimation, type I error control and graphics generation in the R language. CONCLUSIONS: CollapsABEL provides a computationally efficient solution for screening general forms of CH alleles in densely imputed microarray or whole genome sequencing datasets. The GCDH test provides an improved power over single-SNP based methods in detecting the prevalence of CH in human complex phenotypes, offering an opportunity for tackling the missing heritability problem. 
Binary and source packages of CollapsABEL are available on CRAN (https://cran.r-project.org/web/packages/CollapsABEL) and the website of the GenABEL project (http://www.genabel.org/packages). ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-016-1006-9) contains supplementary material, which is available to authorized users. BioMed Central 2016-04-08 /pmc/articles/PMC4826552/ /pubmed/27059780 http://dx.doi.org/10.1186/s12859-016-1006-9 Text en © Zhong et al. 2016 Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Software
Zhong, Kaiyin
Karssen, Lennart C.
Kayser, Manfred
Liu, Fan
CollapsABEL: an R library for detecting compound heterozygote alleles in genome-wide association studies
title CollapsABEL: an R library for detecting compound heterozygote alleles in genome-wide association studies
title_full CollapsABEL: an R library for detecting compound heterozygote alleles in genome-wide association studies
title_fullStr CollapsABEL: an R library for detecting compound heterozygote alleles in genome-wide association studies
title_full_unstemmed CollapsABEL: an R library for detecting compound heterozygote alleles in genome-wide association studies
title_short CollapsABEL: an R library for detecting compound heterozygote alleles in genome-wide association studies
title_sort collapsabel: an r library for detecting compound heterozygote alleles in genome-wide association studies
topic Software
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4826552/
https://www.ncbi.nlm.nih.gov/pubmed/27059780
http://dx.doi.org/10.1186/s12859-016-1006-9
work_keys_str_mv AT zhongkaiyin collapsabelanrlibraryfordetectingcompoundheterozygoteallelesingenomewideassociationstudies
AT karssenlennartc collapsabelanrlibraryfordetectingcompoundheterozygoteallelesingenomewideassociationstudies
AT kaysermanfred collapsabelanrlibraryfordetectingcompoundheterozygoteallelesingenomewideassociationstudies
AT liufan collapsabelanrlibraryfordetectingcompoundheterozygoteallelesingenomewideassociationstudies