A Bayesian approach to efficient differential allocation for resampling-based significance testing


Bibliographic Details
Main Authors: Jensen, Shane T; Soi, Sameer; Wang, Li-San
Format: Text
Language: English
Published: BioMed Central 2009
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2718927/
https://www.ncbi.nlm.nih.gov/pubmed/19558706
http://dx.doi.org/10.1186/1471-2105-10-198
collection PubMed
description BACKGROUND: Large-scale statistical analyses have become hallmarks of post-genomic era biological research due to advances in high-throughput assays and the integration of large biological databases. One accompanying issue is the simultaneous estimation of p-values for a large number of hypothesis tests. In many applications, a parametric assumption in the null distribution such as normality may be unreasonable, and resampling-based p-values are the preferred procedure for establishing statistical significance. Using resampling-based procedures for multiple testing is computationally intensive and typically requires large numbers of resamples.
RESULTS: We present a new approach to more efficiently assign resamples (such as bootstrap samples or permutations) within a nonparametric multiple testing framework. We formulated a Bayesian-inspired approach to this problem, and devised an algorithm that adapts the assignment of resamples iteratively with negligible space and running time overhead. In two experimental studies, a breast cancer microarray dataset and a genome wide association study dataset for Parkinson's disease, we demonstrated that our differential allocation procedure is substantially more accurate compared to the traditional uniform resample allocation.
CONCLUSION: Our experiments demonstrate that using a more sophisticated allocation strategy can improve our inference for hypothesis testing without a drastic increase in the amount of computation on randomized data. Moreover, we gain more improvement in efficiency when the number of tests is large. R code for our algorithm and the shortcut method are available at .
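The idea of differential resample allocation described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm or their R code: the Beta posterior with a uniform prior, the scoring rule, and the names `differential_allocation`, `allocation_score`, `pilot`, `batch`, and `alpha` are all assumptions made for illustration. Each test starts with a small pilot run of resamples, and every further batch goes to the test whose p-value is still most uncertain relative to its distance from the significance threshold.

```python
import random


def exceed_count(observed, null_sampler, n, rng):
    """Count how many of n null draws are at least as extreme as observed."""
    return sum(1 for _ in range(n) if null_sampler(rng) >= observed)


def posterior_sd(k, n):
    """Std. dev. of Beta(k + 1, n - k + 1): posterior of p under a uniform prior."""
    a, b = k + 1, n - k + 1
    return (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5


def allocation_score(k, n, alpha):
    """Illustrative score (an assumption, not from the paper): large when the
    posterior mean is close to alpha compared with the posterior spread."""
    mean = (k + 1) / (n + 2)
    return posterior_sd(k, n) / (abs(mean - alpha) + 1e-12)


def differential_allocation(observed, null_sampler, total,
                            pilot=50, batch=50, alpha=0.05, seed=0):
    """Spend `total` resamples across tests, favouring undecided tests.

    Returns (p_values, resamples_per_test).
    """
    m = len(observed)
    if total < pilot * m:
        raise ValueError("total must cover the pilot run for every test")
    rng = random.Random(seed)

    # Pilot phase: every test receives the same small number of resamples.
    n = [pilot] * m
    k = [exceed_count(x, null_sampler, pilot, rng) for x in observed]
    remaining = total - pilot * m

    # Adaptive phase: give each batch to the currently highest-scoring test.
    while remaining > 0:
        i = max(range(m), key=lambda j: allocation_score(k[j], n[j], alpha))
        step = min(batch, remaining)
        k[i] += exceed_count(observed[i], null_sampler, step, rng)
        n[i] += step
        remaining -= step

    # Add-one estimate keeps every resampling p-value strictly positive.
    p_values = [(k[i] + 1) / (n[i] + 1) for i in range(m)]
    return p_values, n


if __name__ == "__main__":
    # Hypothetical test statistics with a standard normal null distribution.
    stats = [0.1, 1.7, 4.0]
    pvals, used = differential_allocation(
        stats, lambda rng: rng.gauss(0.0, 1.0), total=3000)
    print(pvals, used)
```

In this toy setup the middle statistic has a true p-value close to 0.05, so it should receive most of the budget, while the clearly null and clearly significant tests stop receiving resamples early. A uniform scheme would instead give each test 1000 resamples regardless of how settled its result already is.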
format Text
id pubmed-2718927
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2718927 2009-07-31 A Bayesian approach to efficient differential allocation for resampling-based significance testing. Jensen, Shane T; Soi, Sameer; Wang, Li-San. BMC Bioinformatics. Research Article.
BioMed Central 2009-06-28 /pmc/articles/PMC2718927/ /pubmed/19558706 http://dx.doi.org/10.1186/1471-2105-10-198 Text en Copyright © 2009 Jensen et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
topic Research Article