
The choice of null distributions for detecting gene-gene interactions in genome-wide association studies

BACKGROUND: In genome-wide association studies (GWAS), the number of single-nucleotide polymorphisms (SNPs) typically ranges between 500,000 and 1,000,000. Accordingly, detecting gene-gene interactions in GWAS is computationally challenging because it involves hundreds of billions of SNP pairs. Stag...


Bibliographic Details
Main authors:	Yang, Can; Wan, Xiang; He, Zengyou; Yang, Qiang; Xue, Hong; Yu, Weichuan
Format:	Text
Language:	English
Published:	BioMed Central 2011
Subjects:	Research
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3044281/ https://www.ncbi.nlm.nih.gov/pubmed/21342556 http://dx.doi.org/10.1186/1471-2105-12-S1-S26

collection	PubMed
description	BACKGROUND: In genome-wide association studies (GWAS), the number of single-nucleotide polymorphisms (SNPs) typically ranges between 500,000 and 1,000,000. Accordingly, detecting gene-gene interactions in GWAS is computationally challenging because it involves hundreds of billions of SNP pairs. Stage-wise strategies are often used to overcome the computational difficulty. In the first stage, fast screening methods (e.g., Tuning ReliefF) are applied to reduce the whole SNP set to a small subset. In the second stage, sophisticated modeling methods (e.g., multifactor-dimensionality reduction (MDR)) are applied to the subset of SNPs to identify interesting interaction models and the corresponding interaction patterns. In the third stage, the significance of the identified interaction patterns is evaluated by hypothesis testing. RESULTS: In this paper, we show that this stage-wise strategy can fail to control the false positive rate if the null distribution is not appropriately chosen. This is because screening and modeling may change the null distribution used in hypothesis testing. In our simulation study, we use some popular screening methods and the popular modeling method MDR as examples to show the effect of an inappropriate choice of null distribution. To choose appropriate null distributions, we suggest using the permutation test or testing on an independent data set. We demonstrate their performance using synthetic data and a real genome-wide data set from an age-related macular degeneration (AMD) study. CONCLUSIONS: The permutation test or testing on an independent data set can help choose appropriate null distributions in hypothesis testing, which provides more reliable results in practice.
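The abstract's central point, that screening and modeling change the null distribution, can be illustrated with a small permutation test in which the whole pipeline is rerun on every label shuffle. The sketch below is only an illustration under stated assumptions and is not the authors' implementation: the screening filter (difference in mean genotype) and the pairwise chi-square statistic are toy stand-ins for Tuning ReliefF and MDR, and all function names and parameters are hypothetical.

```python
import random
from itertools import combinations


def pair_chi2(g1, g2, labels):
    """Pearson chi-square of joint genotype (up to 3x3 cells) vs case/control."""
    counts = {}
    for a, b, y in zip(g1, g2, labels):
        counts.setdefault((a, b), [0, 0])[y] += 1
    n = len(labels)
    n_case = sum(labels)
    col_totals = [n - n_case, n_case]
    stat = 0.0
    for cell in counts.values():
        row_total = cell[0] + cell[1]
        for y in (0, 1):
            expected = row_total * col_totals[y] / n
            if expected > 0:
                stat += (cell[y] - expected) ** 2 / expected
    return stat


def screen(snps, labels, keep):
    """Stage 1 (toy filter, assumed): rank SNPs by the absolute difference
    in mean genotype between cases and controls; return the top indices."""
    n_case = sum(labels)
    n_ctrl = len(labels) - n_case

    def score(g):
        case_sum = sum(x for x, y in zip(g, labels) if y == 1)
        return abs(case_sum / n_case - (sum(g) - case_sum) / n_ctrl)

    ranked = sorted(range(len(snps)), key=lambda j: score(snps[j]), reverse=True)
    return ranked[:keep]


def pipeline_stat(snps, labels, keep):
    """Stages 1-2: screen, then take the best pairwise statistic among survivors."""
    kept = screen(snps, labels, keep)
    return max(pair_chi2(snps[i], snps[j], labels) for i, j in combinations(kept, 2))


def permutation_pvalue(snps, labels, keep=5, n_perm=200, seed=0):
    """Stage 3: compare the observed statistic with a null built by rerunning
    screening AND modeling on each shuffled copy of the labels."""
    rng = random.Random(seed)
    observed = pipeline_stat(snps, labels, keep)
    shuffled = list(labels)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pipeline_stat(snps, shuffled, keep) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Here `snps` is assumed to be a list of per-SNP genotype vectors coded 0/1/2 and `labels` a list of 0/1 case-control indicators containing both classes. Referring the observed maximum to a plain chi-square table would ignore both the screening and the maximization over pairs, which is the kind of mismatched null the abstract warns about.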
format	Text
id	pubmed-3044281
institution	National Center for Biotechnology Information
language	English
publishDate	2011
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-3044281 2011-02-25 BMC Bioinformatics Research BioMed Central 2011-02-15 /pmc/articles/PMC3044281/ /pubmed/21342556 http://dx.doi.org/10.1186/1471-2105-12-S1-S26 Text en Copyright ©2011 Yang et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	The choice of null distributions for detecting gene-gene interactions in genome-wide association studies
topic	Research
