Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set

Bibliographic Details
Main authors: Kanai, Masahiro; Tanaka, Toshihiro; Okada, Yukinori
Format: Online Article Text
Language: English
Published: Nature Publishing Group 2016
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5090169/
https://www.ncbi.nlm.nih.gov/pubmed/27305981
http://dx.doi.org/10.1038/jhg.2016.72
author Kanai, Masahiro
Tanaka, Toshihiro
Okada, Yukinori
collection PubMed
description To assess the statistical significance of associations between variants and traits, genome-wide association studies (GWAS) should employ an appropriate threshold that accounts for the massive burden of multiple testing in the study. Although most studies in the current literature commonly set a genome-wide significance threshold at the level of P = 5.0 × 10⁻⁸, the adequacy of this value for respective populations has not been fully investigated. To empirically estimate thresholds for different ancestral populations, we conducted GWAS simulations using the 1000 Genomes Phase 3 data set for Africans (AFR), Europeans (EUR), Admixed Americans (AMR), East Asians (EAS) and South Asians (SAS). The estimated empirical genome-wide significance thresholds were P_sig = 3.24 × 10⁻⁸ (AFR), 9.26 × 10⁻⁸ (EUR), 1.83 × 10⁻⁷ (AMR), 1.61 × 10⁻⁷ (EAS) and 9.46 × 10⁻⁸ (SAS). We additionally conducted trans-ethnic meta-analyses across all populations (ALL) and all populations except for AFR (ΔAFR), which yielded P_sig = 3.25 × 10⁻⁸ (ALL) and 4.20 × 10⁻⁸ (ΔAFR). Our results indicate that the current threshold (P = 5.0 × 10⁻⁸) is overly stringent for all ancestral populations except for Africans; however, we should employ a more stringent threshold when conducting a meta-analysis, regardless of the presence of African samples.
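The arithmetic behind the abstract's conclusion can be checked with a short sketch. The threshold values below are the ones reported in the description; the family-wise error rate of 0.05, the Bonferroni-style reading of each threshold as `alpha / number_of_tests`, and all function names are assumptions made for illustration, not details given in this record.

```python
# Thresholds as reported in the abstract of this record.
CONVENTIONAL = 5.0e-8
EMPIRICAL = {
    "AFR": 3.24e-8,
    "EUR": 9.26e-8,
    "AMR": 1.83e-7,
    "EAS": 1.61e-7,
    "SAS": 9.46e-8,
    "ALL": 3.25e-8,    # trans-ethnic meta-analysis, all populations
    "ΔAFR": 4.20e-8,   # trans-ethnic meta-analysis, all except AFR
}

# Assumption: a family-wise error rate of 0.05 (not stated in the abstract).
ALPHA = 0.05


def implied_independent_tests(p_sig, alpha=ALPHA):
    """Number of independent tests a Bonferroni correction would need
    to produce the threshold p_sig at the given alpha (hypothetical helper)."""
    return alpha / p_sig


def compare(p_sig, conventional=CONVENTIONAL):
    """Say whether p_sig is stricter or more lenient than the conventional value."""
    if p_sig < conventional:
        return "stricter"
    if p_sig > conventional:
        return "more lenient"
    return "same"


def summarize():
    """Return (label, threshold, comparison, implied tests) for each analysis."""
    return [
        (label, p, compare(p), implied_independent_tests(p))
        for label, p in EMPIRICAL.items()
    ]


if __name__ == "__main__":
    for label, p, verdict, n_tests in summarize():
        print(f"{label:5s} P_sig={p:.2e}  {verdict:12s} ~{n_tests:,.0f} tests")
```

Under these assumptions, the conventional threshold corresponds to roughly one million independent tests. The AFR, ALL and ΔAFR thresholds fall below 5.0 × 10⁻⁸, while the EUR, AMR, EAS and SAS thresholds lie above it, which matches the abstract's statement.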
format Online
Article
Text
id pubmed-5090169
institution National Center for Biotechnology Information
language English
publishDate 2016
publisher Nature Publishing Group
record_format MEDLINE/PubMed
spelling pubmed-5090169 2016-11-18 J Hum Genet Original Article Nature Publishing Group 2016-10 2016-06-16 /pmc/articles/PMC5090169/ /pubmed/27305981 http://dx.doi.org/10.1038/jhg.2016.72 Text en Copyright © 2016 The Japan Society of Human Genetics http://creativecommons.org/licenses/by-nc-sa/4.0/ This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in the credit line; if the material is not included under the Creative Commons license, users will need to obtain permission from the license holder to reproduce the material. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/
title Empirical estimation of genome-wide significance thresholds based on the 1000 Genomes Project data set
topic Original Article