Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs

Detecting gene-environment interactions (GXE) with rare variants is critical in dissecting the etiology of common diseases. Interactions with rare haplotype variants (rHTV) are of particular interest. At the same time, complex sampling designs, such as stratified random sampling, are becoming increa...

Bibliographic Details
Main authors: Zhang, Yuan, Hofmann, Jonathan N., Purdue, Mark P., Lin, Shili, Biswas, Swati
Format: Online Article Text
Language: English
Published: 2017
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5572548/
https://www.ncbi.nlm.nih.gov/pubmed/28424482
http://dx.doi.org/10.1038/jhg.2017.43
author Zhang, Yuan
Hofmann, Jonathan N.
Purdue, Mark P.
Lin, Shili
Biswas, Swati
collection PubMed
description Detecting gene-environment interactions (GXE) with rare variants is critical in dissecting the etiology of common diseases. Interactions with rare haplotype variants (rHTV) are of particular interest. At the same time, complex sampling designs, such as stratified random sampling, are becoming increasingly popular for designing case-control studies, especially for recruiting controls. The US Kidney Cancer Study (KCS) is an example, wherein all available cases were included while the controls at each site were randomly selected from the population by frequency matching with cases based on age, sex, and race. There is currently no rHTV association method that can account for such a complex sampling design. To fill this gap, we consider logistic Bayesian LASSO (LBL), an existing rHTV approach for case-control data, and show that its model can easily accommodate the complex sampling design. We study two extensions that include stratifying variables either as main effects only or with additional modeling of their interactions with haplotypes. We conduct extensive simulation studies to compare the complex sampling methods with the original LBL methods. We find that when there is no interaction between haplotype and stratifying variables, both extensions perform well while the original LBL methods lead to inflated type I error rates. However, when such an interaction exists, it is necessary to include the interaction effect in the model to control the type I error. Finally, we analyze the KCS data and find a significant interaction between (current) smoking and a specific rHTV in the N-acetyltransferase 2 (NAT2) gene.
format Online
Article
Text
id pubmed-5572548
institution National Center for Biotechnology Information
language English
publishDate 2017
record_format MEDLINE/PubMed
spelling pubmed-5572548 2017-10-20 J Hum Genet Article 2017-04-20 2017-09 /pmc/articles/PMC5572548/ /pubmed/28424482 http://dx.doi.org/10.1038/jhg.2017.43 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms
title Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs
topic Article