Comparison of multimarker logistic regression models, with application to a genomewide scan of schizophrenia

BACKGROUND: Genome-wide association studies (GWAS) are a widely used study design for detecting genetic causes of complex diseases. Current studies provide good coverage of common causal SNPs, but not rare ones. A popular method to detect rare causal variants is haplotype testing. A disadvantage of...

Bibliographic Details
Main authors:	Wason, James MS, Dudbridge, Frank
Format:	Text
Language:	English
Published:	BioMed Central 2010
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2949738/ https://www.ncbi.nlm.nih.gov/pubmed/20828390 http://dx.doi.org/10.1186/1471-2156-11-80

author	Wason, James MS Dudbridge, Frank
collection	PubMed
description	BACKGROUND: Genome-wide association studies (GWAS) are a widely used study design for detecting genetic causes of complex diseases. Current studies provide good coverage of common causal SNPs, but not rare ones. A popular method to detect rare causal variants is haplotype testing. A disadvantage of this approach is that many parameters are estimated simultaneously, which can mean a loss of power and slower fitting to large datasets. Haplotype testing effectively tests both the allele frequencies and the linkage disequilibrium (LD) structure of the data. LD has previously been shown to be mostly attributable to LD between adjacent SNPs. We propose a generalised linear model (GLM) which models the effects of each SNP in a region as well as the statistical interactions between adjacent pairs. This is compared to two other commonly used multimarker GLMs: one with a main-effect parameter for each SNP; one with a parameter for each haplotype. RESULTS: We show the haplotype model has higher power for rare untyped causal SNPs, the main-effects model has higher power for common untyped causal SNPs, and the proposed model generally has power in between the two others. We show that the relative power of the three methods is dependent on the number of marker haplotypes the causal allele is present on, which depends on the age of the mutation. Except in the case of a common causal variant in high LD with markers, all three multimarker models are superior in power to single-SNP tests. Including the adjacent statistical interactions results in lower inflation in test statistics when a realistic level of population stratification is present in a dataset. Using the multimarker models, we analyse data from the Molecular Genetics of Schizophrenia study. The multimarker models find potential associations that are not found by single-SNP tests. However, multimarker models also require stricter control of data quality since biases can have a larger inflationary effect on multimarker test statistics than on single-SNP test statistics. CONCLUSIONS: Analysing a GWAS with multimarker models can yield candidate regions which may contain rare untyped causal variants. This is useful for increasing prior odds of association in future whole-genome sequence analyses.
format	Text
id	pubmed-2949738
institution	National Center for Biotechnology Information
language	English
publishDate	2010
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-2949738 2010-11-03 Comparison of multimarker logistic regression models, with application to a genomewide scan of schizophrenia Wason, James MS Dudbridge, Frank BMC Genet Methodology Article BioMed Central 2010-09-09 /pmc/articles/PMC2949738/ /pubmed/20828390 http://dx.doi.org/10.1186/1471-2156-11-80 Text en Copyright ©2010 Wason and Dudbridge; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Comparison of multimarker logistic regression models, with application to a genomewide scan of schizophrenia
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2949738/ https://www.ncbi.nlm.nih.gov/pubmed/20828390 http://dx.doi.org/10.1186/1471-2156-11-80
