Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test

BACKGROUND: The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, f...

Bibliographic Details
Main Authors: Swanson, David M, Blacker, Deborah, AlChawa, Taofik, Ludwig, Kerstin U, Mangold, Elisabeth, Lange, Christoph
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3831057/
https://www.ncbi.nlm.nih.gov/pubmed/24199751
http://dx.doi.org/10.1186/1471-2156-14-108
author Swanson, David M
Blacker, Deborah
AlChawa, Taofik
Ludwig, Kerstin U
Mangold, Elisabeth
Lange, Christoph
author_sort Swanson, David M
collection PubMed
description BACKGROUND: The advent of genome-wide association studies has led to many novel disease-SNP associations, opening the door to focused study on their biological underpinnings. Because of the importance of analyzing these associations, numerous statistical methods have been devoted to them. However, fewer methods have attempted to associate entire genes or genomic regions with outcomes, which is potentially more useful knowledge from a biological perspective, and those methods currently implemented are often permutation-based.
RESULTS: One property of some permutation-based tests is that their power varies as a function of whether significant markers are in regions of linkage disequilibrium (LD) or not, which we show from a theoretical perspective. We therefore develop two methods for quantifying the degree of association between a genomic region and outcome, both of whose power does not vary as a function of LD structure. One method uses dimension reduction to “filter” redundant information when significant LD exists in the region, while the other, called the summary-statistic test, controls for LD by scaling marker Z-statistics using knowledge of the correlation matrix of markers. An advantage of this latter test is that it does not require the original data, but only their Z-statistics from univariate regressions and an estimate of the correlation structure of markers, and we show how to modify the test to protect the type 1 error rate when the correlation structure of markers is misspecified. We apply these methods to sequence data of oral cleft and compare our results to previously proposed gene tests, in particular permutation-based ones. We evaluate the versatility of the modification of the summary-statistic test since the specification of correlation structure between markers can be inaccurate.
CONCLUSION: We find a significant association in the sequence data between the 8q24 region and oral cleft using our dimension reduction approach and a borderline significant association using the summary-statistic based approach. We also implement the summary-statistic test using Z-statistics from an already-published GWAS of Chronic Obstructive Pulmonary Disorder (COPD) and correlation structure obtained from HapMap. We experiment with the modification of this test because the correlation structure is assumed imperfectly known.
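The abstract describes scaling marker Z-statistics by the marker correlation matrix but does not give a formula. The sketch below is therefore only an illustration of one standard way to do that, not the authors' implementation: assuming the m Z-statistics are jointly normal with mean zero and correlation matrix R under the null, the quadratic form Z'R^{-1}Z follows a chi-square distribution with m degrees of freedom. The function names (`solve`, `ld_scaled_statistic`, `chi2_sf`) and the example values are hypothetical, and the sketch does not include the paper's modification for a misspecified correlation structure.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def ld_scaled_statistic(z, corr):
    """Quadratic form z' corr^{-1} z (assumed test statistic, see lead-in)."""
    w = solve(corr, z)
    return sum(zi * wi for zi, wi in zip(z, w))


def chi2_sf(x, df):
    """Upper-tail chi-square probability via the series for the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 1.0
    a, t = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= t / (a + k)
        total += term
        k += 1
    lower = total * math.exp(-t + a * math.log(t) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


# Two markers with identical Z-statistics: unlinked versus in strong LD.
z = [2.0, 2.0]
independent = [[1.0, 0.0], [0.0, 1.0]]
strong_ld = [[1.0, 0.9], [0.9, 1.0]]

for label, corr in (("independent", independent), ("strong LD", strong_ld)):
    stat = ld_scaled_statistic(z, corr)
    print(label, round(stat, 4), round(chi2_sf(stat, len(z)), 4))
```

With independent markers the statistic reduces to the plain sum of squared Z-statistics; with strong LD the same two Z-statistics yield a smaller statistic, because the second marker carries largely redundant information.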
format Online
Article
Text
id pubmed-3831057
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3831057 2013-11-20 Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test Swanson, David M Blacker, Deborah AlChawa, Taofik Ludwig, Kerstin U Mangold, Elisabeth Lange, Christoph BMC Genet Methodology Article BioMed Central 2013-11-07 /pmc/articles/PMC3831057/ /pubmed/24199751 http://dx.doi.org/10.1186/1471-2156-14-108 Text en Copyright © 2013 Swanson et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Properties of permutation-based gene tests and controlling type 1 error using a summary statistic based gene test
topic Methodology Article