
Computationally efficient permutation-based confidence interval estimation for tail-area FDR

Challenges of satisfying parametric assumptions in genomic settings with thousands or millions of tests have led investigators to combine powerful False Discovery Rate (FDR) approaches with computationally expensive but exact permutation testing. We describe a computationally efficient permutation-b...


Bibliographic Details
Main Authors: Millstein, Joshua; Volfson, Dmitri
Format: Online Article Text
Language: English
Published: Frontiers Media S.A. 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3775454/
https://www.ncbi.nlm.nih.gov/pubmed/24062767
http://dx.doi.org/10.3389/fgene.2013.00179
author Millstein, Joshua
Volfson, Dmitri
collection PubMed
description Challenges of satisfying parametric assumptions in genomic settings with thousands or millions of tests have led investigators to combine powerful False Discovery Rate (FDR) approaches with computationally expensive but exact permutation testing. We describe a computationally efficient permutation-based approach that includes a tractable estimator of the proportion of true null hypotheses, the variance of the log of tail-area FDR, and a confidence interval (CI) estimator, which accounts for the number of permutations conducted and dependencies between tests. The CI estimator applies a binomial distribution and an overdispersion parameter to counts of positive tests. The approach is general with regards to the distribution of the test statistic, it performs favorably in comparison to other approaches, and reliable FDR estimates are demonstrated with as few as 10 permutations. An application of this approach to relate sleep patterns to gene expression patterns in mouse hypothalamus yielded a set of 11 transcripts associated with 24 h REM sleep [FDR = 0.15 (0.08, 0.26)]. Two of the corresponding genes, Sfrp1 and Sfrp4, are involved in wnt signaling and several others, Irf7, Ifit1, Iigp2, and Ifih1, have links to interferon signaling. These genes would have been overlooked had a typical a priori FDR threshold such as 0.05 or 0.1 been applied. The CI provides the flexibility for choosing a significance threshold based on tolerance for false discoveries and precision of the FDR estimate. That is, it frees the investigator to use a more data-driven approach to define significance, such as the minimum estimated FDR, an option that is especially useful for weak effects, often observed in studies of complex diseases.
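As a rough illustration of the idea summarized in the description, the sketch below computes a generic plug-in tail-area FDR estimate from permutation results: the average number of tests called positive in permuted data divided by the number called positive in the observed data. This is not the authors' estimator. It omits their estimate of the proportion of true null hypotheses and their overdispersion-adjusted confidence interval, and the function name `permutation_fdr`, its parameters, and the toy data are all hypothetical.

```python
def permutation_fdr(observed, permuted, threshold):
    """Generic plug-in tail-area FDR estimate at a given threshold.

    observed:  list of test statistics computed on the real data.
    permuted:  list of lists, one list of statistics per permutation.
    threshold: a test is called positive when its statistic >= threshold.

    Returns None when no observed test is called positive, since the
    ratio is undefined in that case.
    """
    if not permuted:
        raise ValueError("at least one permutation is required")

    # Number of tests called positive in the real data.
    n_positive = sum(1 for s in observed if s >= threshold)
    if n_positive == 0:
        return None

    # Average number of positives per permuted (null) data set.
    total_null_positive = sum(
        sum(1 for s in stats if s >= threshold) for stats in permuted
    )
    mean_null_positive = total_null_positive / len(permuted)

    # Cap at 1 because an FDR is a proportion.
    return min(1.0, mean_null_positive / n_positive)


# Toy data (made up for illustration): six tests, two permutations.
observed = [5.0, 4.2, 3.9, 1.0, 0.5, 0.2]
permuted = [
    [4.0, 1.0, 0.3, 0.2, 0.1, 0.0],
    [1.2, 0.9, 0.8, 0.4, 0.3, 0.1],
]
print(permutation_fdr(observed, permuted, threshold=3.5))
```

With this toy input, three observed tests pass the threshold and the permutations yield one and zero positives, so the estimate is 0.5 / 3. A point estimate like this carries no information about its own precision, which is the gap the confidence interval described in the record is meant to fill.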
format Online
Article
Text
id pubmed-3775454
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Frontiers Media S.A.
record_format MEDLINE/PubMed
spelling pubmed-3775454 2013-09-23 Computationally efficient permutation-based confidence interval estimation for tail-area FDR Millstein, Joshua Volfson, Dmitri Front Genet Genetics Frontiers Media S.A. 2013-09-17 /pmc/articles/PMC3775454/ /pubmed/24062767 http://dx.doi.org/10.3389/fgene.2013.00179 Text en Copyright © 2013 Millstein and Volfson. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
title Computationally efficient permutation-based confidence interval estimation for tail-area FDR
topic Genetics
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3775454/
https://www.ncbi.nlm.nih.gov/pubmed/24062767
http://dx.doi.org/10.3389/fgene.2013.00179