Comparison of multimarker logistic regression models, with application to a genomewide scan of schizophrenia
BACKGROUND: Genome-wide association studies (GWAS) are a widely used study design for detecting genetic causes of complex diseases. Current studies provide good coverage of common causal SNPs, but not rare ones. A popular method to detect rare causal variants is haplotype testing. A disadvantage of...
Main authors: | Wason, James MS; Dudbridge, Frank |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2010 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2949738/ https://www.ncbi.nlm.nih.gov/pubmed/20828390 http://dx.doi.org/10.1186/1471-2156-11-80 |
_version_ | 1782187567690547200 |
---|---|
author | Wason, James MS Dudbridge, Frank |
author_facet | Wason, James MS Dudbridge, Frank |
author_sort | Wason, James MS |
collection | PubMed |
description | BACKGROUND: Genome-wide association studies (GWAS) are a widely used study design for detecting genetic causes of complex diseases. Current studies provide good coverage of common causal SNPs, but not rare ones. A popular method to detect rare causal variants is haplotype testing. A disadvantage of this approach is that many parameters are estimated simultaneously, which can mean a loss of power and slower fitting to large datasets. Haplotype testing effectively tests both the allele frequencies and the linkage disequilibrium (LD) structure of the data. LD has previously been shown to be mostly attributable to LD between adjacent SNPs. We propose a generalised linear model (GLM) which models the effects of each SNP in a region as well as the statistical interactions between adjacent pairs. This is compared to two other commonly used multimarker GLMs: one with a main-effect parameter for each SNP; one with a parameter for each haplotype. RESULTS: We show the haplotype model has higher power for rare untyped causal SNPs, the main-effects model has higher power for common untyped causal SNPs, and the proposed model generally has power in between the two others. We show that the relative power of the three methods is dependent on the number of marker haplotypes the causal allele is present on, which depends on the age of the mutation. Except in the case of a common causal variant in high LD with markers, all three multimarker models are superior in power to single-SNP tests. Including the adjacent statistical interactions results in lower inflation in test statistics when a realistic level of population stratification is present in a dataset. Using the multimarker models, we analyse data from the Molecular Genetics of Schizophrenia study. The multimarker models find potential associations that are not found by single-SNP tests. However, multimarker models also require stricter control of data quality since biases can have a larger inflationary effect on multimarker test statistics than on single-SNP test statistics. CONCLUSIONS: Analysing a GWAS with multimarker models can yield candidate regions which may contain rare untyped causal variants. This is useful for increasing prior odds of association in future whole-genome sequence analyses. |
format | Text |
id | pubmed-2949738 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2949738 2010-11-03 Comparison of multimarker logistic regression models, with application to a genomewide scan of schizophrenia Wason, James MS Dudbridge, Frank BMC Genet Methodology Article BACKGROUND: Genome-wide association studies (GWAS) are a widely used study design for detecting genetic causes of complex diseases. Current studies provide good coverage of common causal SNPs, but not rare ones. A popular method to detect rare causal variants is haplotype testing. A disadvantage of this approach is that many parameters are estimated simultaneously, which can mean a loss of power and slower fitting to large datasets. Haplotype testing effectively tests both the allele frequencies and the linkage disequilibrium (LD) structure of the data. LD has previously been shown to be mostly attributable to LD between adjacent SNPs. We propose a generalised linear model (GLM) which models the effects of each SNP in a region as well as the statistical interactions between adjacent pairs. This is compared to two other commonly used multimarker GLMs: one with a main-effect parameter for each SNP; one with a parameter for each haplotype. RESULTS: We show the haplotype model has higher power for rare untyped causal SNPs, the main-effects model has higher power for common untyped causal SNPs, and the proposed model generally has power in between the two others. We show that the relative power of the three methods is dependent on the number of marker haplotypes the causal allele is present on, which depends on the age of the mutation. Except in the case of a common causal variant in high LD with markers, all three multimarker models are superior in power to single-SNP tests. Including the adjacent statistical interactions results in lower inflation in test statistics when a realistic level of population stratification is present in a dataset. Using the multimarker models, we analyse data from the Molecular Genetics of Schizophrenia study. The multimarker models find potential associations that are not found by single-SNP tests. However, multimarker models also require stricter control of data quality since biases can have a larger inflationary effect on multimarker test statistics than on single-SNP test statistics. CONCLUSIONS: Analysing a GWAS with multimarker models can yield candidate regions which may contain rare untyped causal variants. This is useful for increasing prior odds of association in future whole-genome sequence analyses. BioMed Central 2010-09-09 /pmc/articles/PMC2949738/ /pubmed/20828390 http://dx.doi.org/10.1186/1471-2156-11-80 Text en Copyright ©2010 Wason and Dudbridge; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Methodology Article Wason, James MS Dudbridge, Frank Comparison of multimarker logistic regression models, with application to a genomewide scan of schizophrenia |
title | Comparison of multimarker logistic regression models, with application to a genomewide scan of schizophrenia |
title_sort | comparison of multimarker logistic regression models, with application to a genomewide scan of schizophrenia |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2949738/ https://www.ncbi.nlm.nih.gov/pubmed/20828390 http://dx.doi.org/10.1186/1471-2156-11-80 |
work_keys_str_mv | AT wasonjamesms comparisonofmultimarkerlogisticregressionmodelswithapplicationtoagenomewidescanofschizophrenia AT dudbridgefrank comparisonofmultimarkerlogisticregressionmodelswithapplicationtoagenomewidescanofschizophrenia |
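
The abstract contrasts a main-effects model (one parameter per SNP) with a proposed model that adds statistical interactions between adjacent SNP pairs. The sketch below illustrates only how the predictor columns of those two models differ. It is not the authors' implementation: the function names, the 0/1/2 allele-count coding, and the use of a simple product for each interaction term are assumptions made for illustration, and the article may code genotypes or haplotypes differently.

```python
from typing import List, Sequence


def main_effects_design(genotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return one predictor column per SNP (assumed additive allele counts)."""
    return [list(row) for row in genotypes]


def adjacent_interaction_design(genotypes: Sequence[Sequence[int]]) -> List[List[int]]:
    """Return main-effect columns followed by one product column
    for each pair of adjacent SNPs (assumed interaction coding)."""
    design = []
    for row in genotypes:
        # Interaction between SNP j and SNP j+1 only; non-adjacent pairs are omitted.
        interactions = [row[j] * row[j + 1] for j in range(len(row) - 1)]
        design.append(list(row) + interactions)
    return design


# Hypothetical data: three individuals genotyped at three SNPs.
example = [[0, 1, 2], [1, 1, 0], [2, 0, 1]]
main = main_effects_design(example)
adjacent = adjacent_interaction_design(example)
```

Under these assumptions, a region of m SNPs gives m predictors in the main-effects model and 2m - 1 in the adjacent-interaction model, whereas a haplotype model can need up to 2^m - 1 parameters for biallelic markers. This is consistent with the abstract's remark that haplotype testing estimates many parameters simultaneously.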