Computationally efficient permutation-based confidence interval estimation for tail-area FDR
Challenges of satisfying parametric assumptions in genomic settings with thousands or millions of tests have led investigators to combine powerful False Discovery Rate (FDR) approaches with computationally expensive but exact permutation testing. We describe a computationally efficient permutation-b...
Main Authors: | Millstein, Joshua; Volfson, Dmitri |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | Frontiers Media S.A. 2013 |
Subjects: | Genetics |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3775454/ https://www.ncbi.nlm.nih.gov/pubmed/24062767 http://dx.doi.org/10.3389/fgene.2013.00179 |
_version_ | 1782477385009987584 |
---|---|
author | Millstein, Joshua Volfson, Dmitri |
author_facet | Millstein, Joshua Volfson, Dmitri |
author_sort | Millstein, Joshua |
collection | PubMed |
description | Challenges of satisfying parametric assumptions in genomic settings with thousands or millions of tests have led investigators to combine powerful False Discovery Rate (FDR) approaches with computationally expensive but exact permutation testing. We describe a computationally efficient permutation-based approach that includes a tractable estimator of the proportion of true null hypotheses, the variance of the log of tail-area FDR, and a confidence interval (CI) estimator, which accounts for the number of permutations conducted and dependencies between tests. The CI estimator applies a binomial distribution and an overdispersion parameter to counts of positive tests. The approach is general with regards to the distribution of the test statistic, it performs favorably in comparison to other approaches, and reliable FDR estimates are demonstrated with as few as 10 permutations. An application of this approach to relate sleep patterns to gene expression patterns in mouse hypothalamus yielded a set of 11 transcripts associated with 24 h REM sleep [FDR = 0.15 (0.08, 0.26)]. Two of the corresponding genes, Sfrp1 and Sfrp4, are involved in wnt signaling and several others, Irf7, Ifit1, Iigp2, and Ifih1, have links to interferon signaling. These genes would have been overlooked had a typical a priori FDR threshold such as 0.05 or 0.1 been applied. The CI provides the flexibility for choosing a significance threshold based on tolerance for false discoveries and precision of the FDR estimate. That is, it frees the investigator to use a more data-driven approach to define significance, such as the minimum estimated FDR, an option that is especially useful for weak effects, often observed in studies of complex diseases. |
format | Online Article Text |
id | pubmed-3775454 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Frontiers Media S.A. |
record_format | MEDLINE/PubMed |
spelling | pubmed-3775454 2013-09-23 Computationally efficient permutation-based confidence interval estimation for tail-area FDR Millstein, Joshua Volfson, Dmitri Front Genet Genetics Challenges of satisfying parametric assumptions in genomic settings with thousands or millions of tests have led investigators to combine powerful False Discovery Rate (FDR) approaches with computationally expensive but exact permutation testing. We describe a computationally efficient permutation-based approach that includes a tractable estimator of the proportion of true null hypotheses, the variance of the log of tail-area FDR, and a confidence interval (CI) estimator, which accounts for the number of permutations conducted and dependencies between tests. The CI estimator applies a binomial distribution and an overdispersion parameter to counts of positive tests. The approach is general with regards to the distribution of the test statistic, it performs favorably in comparison to other approaches, and reliable FDR estimates are demonstrated with as few as 10 permutations. An application of this approach to relate sleep patterns to gene expression patterns in mouse hypothalamus yielded a set of 11 transcripts associated with 24 h REM sleep [FDR = 0.15 (0.08, 0.26)]. Two of the corresponding genes, Sfrp1 and Sfrp4, are involved in wnt signaling and several others, Irf7, Ifit1, Iigp2, and Ifih1, have links to interferon signaling. These genes would have been overlooked had a typical a priori FDR threshold such as 0.05 or 0.1 been applied. The CI provides the flexibility for choosing a significance threshold based on tolerance for false discoveries and precision of the FDR estimate. That is, it frees the investigator to use a more data-driven approach to define significance, such as the minimum estimated FDR, an option that is especially useful for weak effects, often observed in studies of complex diseases. Frontiers Media S.A. 2013-09-17 /pmc/articles/PMC3775454/ /pubmed/24062767 http://dx.doi.org/10.3389/fgene.2013.00179 Text en Copyright © 2013 Millstein and Volfson. http://creativecommons.org/licenses/by/3.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. |
spellingShingle | Genetics Millstein, Joshua Volfson, Dmitri Computationally efficient permutation-based confidence interval estimation for tail-area FDR |
title | Computationally efficient permutation-based confidence interval estimation for tail-area FDR |
topic | Genetics |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3775454/ https://www.ncbi.nlm.nih.gov/pubmed/24062767 http://dx.doi.org/10.3389/fgene.2013.00179 |