A Bayesian approach to efficient differential allocation for resampling-based significance testing
BACKGROUND: Large-scale statistical analyses have become hallmarks of post-genomic era biological research due to advances in high-throughput assays and the integration of large biological databases. One accompanying issue is the simultaneous estimation of p-values for a large number of hypothesis t...
Main authors: | Jensen, Shane T; Soi, Sameer; Wang, Li-San |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2009 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718927/ https://www.ncbi.nlm.nih.gov/pubmed/19558706 http://dx.doi.org/10.1186/1471-2105-10-198 |
_version_ | 1782170039309303808 |
---|---|
author | Jensen, Shane T; Soi, Sameer; Wang, Li-San |
author_facet | Jensen, Shane T; Soi, Sameer; Wang, Li-San |
author_sort | Jensen, Shane T |
collection | PubMed |
description | BACKGROUND: Large-scale statistical analyses have become hallmarks of post-genomic era biological research due to advances in high-throughput assays and the integration of large biological databases. One accompanying issue is the simultaneous estimation of p-values for a large number of hypothesis tests. In many applications, a parametric assumption in the null distribution such as normality may be unreasonable, and resampling-based p-values are the preferred procedure for establishing statistical significance. Using resampling-based procedures for multiple testing is computationally intensive and typically requires large numbers of resamples. RESULTS: We present a new approach to more efficiently assign resamples (such as bootstrap samples or permutations) within a nonparametric multiple testing framework. We formulated a Bayesian-inspired approach to this problem, and devised an algorithm that adapts the assignment of resamples iteratively with negligible space and running time overhead. In two experimental studies, a breast cancer microarray dataset and a genome wide association study dataset for Parkinson's disease, we demonstrated that our differential allocation procedure is substantially more accurate compared to the traditional uniform resample allocation. CONCLUSION: Our experiments demonstrate that using a more sophisticated allocation strategy can improve our inference for hypothesis testing without a drastic increase in the amount of computation on randomized data. Moreover, we gain more improvement in efficiency when the number of tests is large. R code for our algorithm and the shortcut method are available at . |
format | Text |
id | pubmed-2718927 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2009 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2718927 2009-07-31 A Bayesian approach to efficient differential allocation for resampling-based significance testing Jensen, Shane T; Soi, Sameer; Wang, Li-San BMC Bioinformatics Research Article BioMed Central 2009-06-28 /pmc/articles/PMC2718927/ /pubmed/19558706 http://dx.doi.org/10.1186/1471-2105-10-198 Text en Copyright © 2009 Jensen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Research Article Jensen, Shane T Soi, Sameer Wang, Li-San A Bayesian approach to efficient differential allocation for resampling-based significance testing |
title | A Bayesian approach to efficient differential allocation for resampling-based significance testing |
title_sort | bayesian approach to efficient differential allocation for resampling-based significance testing |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718927/ https://www.ncbi.nlm.nih.gov/pubmed/19558706 http://dx.doi.org/10.1186/1471-2105-10-198 |
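
The description above contrasts uniform resample allocation with an allocation that adapts iteratively across tests. The sketch below illustrates that general idea only; it is not the authors' algorithm or their R code. It assumes a two-sample permutation test on the absolute difference in means, a Beta(1, 1) prior on each p-value, and a hypothetical priority rule (posterior standard deviation divided by distance from a threshold `alpha`). The names `mean_diff`, `allocate_resamples`, `budget`, and `batch` are all introduced here for illustration.

```python
import math
import random


def mean_diff(x, y):
    """Test statistic: absolute difference in group means."""
    return abs(sum(x) / len(x) - sum(y) / len(y))


def allocate_resamples(tests, budget, batch=50, alpha=0.05, seed=0):
    """Spend `budget` permutations across `tests`, favouring uncertain tests.

    Illustrative heuristic only (an assumption, not the published method).
    `tests` is a list of (x, y) pairs of numeric samples.
    """
    if budget < batch * len(tests):
        raise ValueError("budget must cover one initial batch per test")
    rng = random.Random(seed)
    observed = [mean_diff(x, y) for x, y in tests]
    exceed = [0] * len(tests)  # permuted statistics >= observed statistic
    drawn = [0] * len(tests)   # permutations spent on each test

    def run(i, n):
        x, y = tests[i]
        pooled = list(x) + list(y)
        for _ in range(n):
            rng.shuffle(pooled)
            if mean_diff(pooled[:len(x)], pooled[len(x):]) >= observed[i]:
                exceed[i] += 1
        drawn[i] += n

    def priority(i):
        # Beta(1 + exceed, 1 + drawn - exceed) posterior for the p-value.
        a, b = 1 + exceed[i], 1 + drawn[i] - exceed[i]
        mean = a / (a + b)
        sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
        # Large when the posterior is wide and sits close to the threshold.
        return sd / (abs(mean - alpha) + 1e-9)

    # Uniform first round so every test has an initial estimate.
    for i in range(len(tests)):
        run(i, batch)
    remaining = budget - batch * len(tests)

    # Adaptive rounds: the next batch goes to the highest-priority test.
    while remaining > 0:
        n = min(batch, remaining)
        run(max(range(len(tests)), key=priority), n)
        remaining -= n

    pvals = [(exceed[i] + 1) / (drawn[i] + 1) for i in range(len(tests))]
    return pvals, drawn
```

Calling `allocate_resamples(tests, budget=400)` returns the estimated p-values together with the number of permutations each test received, so the resulting allocation can be compared directly against a uniform split of the same budget.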