Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs
Detecting gene-environment interactions (GXE) with rare variants is critical in dissecting the etiology of common diseases. Interactions with rare haplotype variants (rHTV) are of particular interest. At the same time, complex sampling designs, such as stratified random sampling, are becoming increa...
Main authors: | Zhang, Yuan; Hofmann, Jonathan N.; Purdue, Mark P.; Lin, Shili; Biswas, Swati |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | 2017 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5572548/ https://www.ncbi.nlm.nih.gov/pubmed/28424482 http://dx.doi.org/10.1038/jhg.2017.43 |
_version_ | 1783259543733534720 |
---|---|
author | Zhang, Yuan; Hofmann, Jonathan N.; Purdue, Mark P.; Lin, Shili; Biswas, Swati |
author_facet | Zhang, Yuan; Hofmann, Jonathan N.; Purdue, Mark P.; Lin, Shili; Biswas, Swati |
author_sort | Zhang, Yuan |
collection | PubMed |
description | Detecting gene-environment interactions (GXE) with rare variants is critical in dissecting the etiology of common diseases. Interactions with rare haplotype variants (rHTV) are of particular interest. At the same time, complex sampling designs, such as stratified random sampling, are becoming increasingly popular for designing case-control studies, especially for recruiting controls. The US Kidney Cancer Study (KCS) is an example, wherein all available cases were included while the controls at each site were randomly selected from the population by frequency matching with cases based on age, sex, and race. There is currently no rHTV association method that can account for such a complex sampling design. To fill this gap, we consider logistic Bayesian LASSO (LBL), an existing rHTV approach for case-control data, and show that its model can easily accommodate the complex sampling design. We study two extensions that include stratifying variables either as main effects only or with additional modeling of their interactions with haplotypes. We conduct extensive simulation studies to compare the complex sampling methods with the original LBL methods. We find that when there is no interaction between haplotype and stratifying variables, both extensions perform well while the original LBL methods lead to inflated type I error rates. However, when such an interaction exists, it is necessary to include the interaction effect in the model to control the type I error. Finally, we analyze the KCS data and find a significant interaction between (current) smoking and a specific rHTV in the N-acetyltransferase 2 (NAT2) gene. |
format | Online Article Text |
id | pubmed-5572548 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2017 |
record_format | MEDLINE/PubMed |
spelling | pubmed-55725482017-10-20 Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs Zhang, Yuan Hofmann, Jonathan N. Purdue, Mark P. Lin, Shili Biswas, Swati J Hum Genet Article Detecting gene-environment interactions (GXE) with rare variants is critical in dissecting the etiology of common diseases. Interactions with rare haplotype variants (rHTV) are of particular interest. At the same time, complex sampling designs, such as stratified random sampling, are becoming increasingly popular for designing case-control studies, especially for recruiting controls. The US Kidney Cancer Study (KCS) is an example, wherein all available cases were included while the controls at each site were randomly selected from the population by frequency matching with cases based on age, sex, and race. There is currently no rHTV association method that can account for such a complex sampling design. To fill this gap, we consider logistic Bayesian LASSO (LBL), an existing rHTV approach for case-control data, and show that its model can easily accommodate the complex sampling design. We study two extensions that include stratifying variables either as main effects only or with additional modeling of their interactions with haplotypes. We conduct extensive simulation studies to compare the complex sampling methods with the original LBL methods. We find that when there is no interaction between haplotype and stratifying variables, both extensions perform well while the original LBL methods lead to inflated type I error rates. However, when such an interaction exists, it is necessary to include the interaction effect in the model to control the type I error. Finally, we analyze the KCS data and find a significant interaction between (current) smoking and a specific rHTV in the N-acetyltransferase 2 (NAT2) gene. 
2017-04-20 2017-09 /pmc/articles/PMC5572548/ /pubmed/28424482 http://dx.doi.org/10.1038/jhg.2017.43 Text en Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms |
spellingShingle | Article Zhang, Yuan Hofmann, Jonathan N. Purdue, Mark P. Lin, Shili Biswas, Swati Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs |
title | Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs |
title_full | Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs |
title_fullStr | Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs |
title_full_unstemmed | Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs |
title_short | Logistic Bayesian LASSO for Genetic Association Analysis of Data from Complex Sampling Designs |
title_sort | logistic bayesian lasso for genetic association analysis of data from complex sampling designs |
topic | Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5572548/ https://www.ncbi.nlm.nih.gov/pubmed/28424482 http://dx.doi.org/10.1038/jhg.2017.43 |
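
The description field states that the two extensions include the stratifying variables either as main effects only or together with their interactions with haplotypes. The following is a minimal sketch of how the covariate vector for one subject could differ between the two cases. It is not the authors' implementation: the function name `design_row`, the input coding (haplotype copy counts, a binary exposure, dummy-coded strata), and the column order are all assumptions made for illustration, and the LASSO priors and model fitting used by LBL are not shown.

```python
def design_row(hap, env, strata, hap_by_strata=False):
    """Build one subject's covariate vector (hypothetical helper).

    hap:    copies of each non-baseline haplotype, e.g. [1, 0]
    env:    binary environmental exposure, e.g. 1 for current smoking
    strata: dummy-coded stratifying variables (age, sex, race groups)
    """
    row = [1.0]                                   # intercept
    row += [float(h) for h in hap]                # haplotype main effects
    row.append(float(env))                        # environment main effect
    row += [float(h * env) for h in hap]          # haplotype-by-environment terms
    row += [float(s) for s in strata]             # stratifying variables, main effects
    if hap_by_strata:
        # Second extension: every haplotype crossed with every stratum variable.
        row += [float(h * s) for h in hap for s in strata]
    return row


hap, env, strata = [1, 0], 1, [0, 1, 1]
main_only = design_row(hap, env, strata)
with_interactions = design_row(hap, env, strata, hap_by_strata=True)
print(len(main_only), len(with_interactions))
```

Under these assumptions, the second extension adds one column per haplotype-stratum pair, so the number of coefficients grows quickly with the number of strata.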