Statistical distributions of test statistics used for quantitative trait association mapping in structured populations
BACKGROUND: Spurious associations between single nucleotide polymorphisms and phenotypes are a major issue in genome-wide association studies and have led to underestimation of type 1 error rate and overestimation of the number of quantitative trait loci found. Many authors have investigated the inf...
Main authors: | Teyssèdre, Simon; Elsen, Jean-Michel; Ricard, Anne |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2012 |
Subjects: | Research |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3817592/ https://www.ncbi.nlm.nih.gov/pubmed/23146127 http://dx.doi.org/10.1186/1297-9686-44-32 |
_version_ | 1782478096285302784 |
---|---|
author | Teyssèdre, Simon Elsen, Jean-Michel Ricard, Anne |
author_facet | Teyssèdre, Simon Elsen, Jean-Michel Ricard, Anne |
author_sort | Teyssèdre, Simon |
collection | PubMed |
description | BACKGROUND: Spurious associations between single nucleotide polymorphisms and phenotypes are a major issue in genome-wide association studies and have led to underestimation of type 1 error rate and overestimation of the number of quantitative trait loci found. Many authors have investigated the influence of population structure on the robustness of methods by simulation. This paper is aimed at developing further the algebraic formalization of power and type 1 error rate for some of the classical statistical methods used: simple regression, two approximate methods of mixed models involving the effect of a single nucleotide polymorphism (SNP) and a random polygenic effect (GRAMMAR and FASTA) and the transmission/disequilibrium test for quantitative traits and nuclear families. Analytical formulae were derived using matrix algebra for the first and second moments of the statistical tests, assuming a true mixed model with a polygenic effect and SNP effects. RESULTS: The expectation and variance of the test statistics and their marginal expectations and variances according to the distribution of genotypes and estimators of variance components are given as a function of the relationship matrix and of the heritability of the polygenic effect. These formulae were used to compute type 1 error rate and power for any kind of relationship matrix between phenotyped and genotyped individuals for any level of heritability. For the regression method, type 1 error rate increased with the variability of relationships and with heritability, but decreased with the GRAMMAR method and was not affected with the FASTA and quantitative transmission/disequilibrium test methods. CONCLUSIONS: The formulae can be easily used to provide the correct threshold of type 1 error rate and to calculate the power when designing experiments or data collection protocols. The results concerning the efficacy of each method agree with simulation results in the literature but were generalized in this work. The power of the GRAMMAR method was equal to the power of the FASTA method at the same type 1 error rate. The power of the quantitative transmission/disequilibrium test was low. In conclusion, the FASTA method, which is very close to the full mixed model, is recommended in association mapping studies. |
format | Online Article Text |
id | pubmed-3817592 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2012 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-3817592 2013-11-07 Genet Sel Evol Research BioMed Central 2012-11-12 /pmc/articles/PMC3817592/ /pubmed/23146127 http://dx.doi.org/10.1186/1297-9686-44-32 Text en Copyright © 2012 Teyssèdre et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Teyssèdre, Simon Elsen, Jean-Michel Ricard, Anne Statistical distributions of test statistics used for quantitative trait association mapping in structured populations |
title | Statistical distributions of test statistics used for quantitative trait association mapping in structured populations |
topic | Research |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3817592/ https://www.ncbi.nlm.nih.gov/pubmed/23146127 http://dx.doi.org/10.1186/1297-9686-44-32 |