A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
BACKGROUND: The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Bo...
Main authors: | Heskes, Tom; Eisinga, Rob; Breitling, Rainer |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2014 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4245829/ https://www.ncbi.nlm.nih.gov/pubmed/25413493 http://dx.doi.org/10.1186/s12859-014-0367-1 |
_version_ | 1782346432709132288 |
---|---|
author | Heskes, Tom; Eisinga, Rob; Breitling, Rainer |
author_facet | Heskes, Tom; Eisinga, Rob; Breitling, Rainer |
author_sort | Heskes, Tom |
collection | PubMed |
description | BACKGROUND: The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Both exact calculation and permutation and gamma approximations have been proposed to determine molecule-level significance. These current approaches have serious drawbacks as they are either computationally burdensome or provide inaccurate estimates in the tail of the p-value distribution. RESULTS: We derive strict lower and upper bounds to the exact p-value along with an accurate approximation that can be used to assess the significance of the rank product statistic in a computationally fast manner. The bounds and the proposed approximation are shown to provide far better accuracy than existing approximate methods in determining tail probabilities, with the slightly conservative upper bound protecting against false positives. We illustrate the proposed method in the context of a recently published analysis on transcriptomic profiling performed in blood. CONCLUSIONS: We provide a method to determine upper bounds and accurate approximate p-values of the rank product statistic. The proposed algorithm provides an order of magnitude increase in throughput as compared with current approaches and offers the opportunity to explore new application domains with even larger multiple testing issues. The R code is published in one of the Additional files and is available at http://www.ru.nl/publish/pages/726696/rankprodbounds.zip. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0367-1) contains supplementary material, which is available to authorized users. |
format | Online Article Text |
id | pubmed-4245829 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2014 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-42458292014-11-28 A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments Heskes, Tom Eisinga, Rob Breitling, Rainer BMC Bioinformatics Methodology Article BACKGROUND: The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Both exact calculation and permutation and gamma approximations have been proposed to determine molecule-level significance. These current approaches have serious drawbacks as they are either computationally burdensome or provide inaccurate estimates in the tail of the p-value distribution. RESULTS: We derive strict lower and upper bounds to the exact p-value along with an accurate approximation that can be used to assess the significance of the rank product statistic in a computationally fast manner. The bounds and the proposed approximation are shown to provide far better accuracy than existing approximate methods in determining tail probabilities, with the slightly conservative upper bound protecting against false positives. We illustrate the proposed method in the context of a recently published analysis on transcriptomic profiling performed in blood. CONCLUSIONS: We provide a method to determine upper bounds and accurate approximate p-values of the rank product statistic. The proposed algorithm provides an order of magnitude increase in throughput as compared with current approaches and offers the opportunity to explore new application domains with even larger multiple testing issues. The R code is published in one of the Additional files and is available at http://www.ru.nl/publish/pages/726696/rankprodbounds.zip. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0367-1) contains supplementary material, which is available to authorized users. BioMed Central 2014-11-21 /pmc/articles/PMC4245829/ /pubmed/25413493 http://dx.doi.org/10.1186/s12859-014-0367-1 Text en © Heskes et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Methodology Article Heskes, Tom Eisinga, Rob Breitling, Rainer A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments |
title | A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments |
title_full | A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments |
title_fullStr | A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments |
title_full_unstemmed | A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments |
title_short | A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments |
title_sort | fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4245829/ https://www.ncbi.nlm.nih.gov/pubmed/25413493 http://dx.doi.org/10.1186/s12859-014-0367-1 |
work_keys_str_mv | AT heskestom afastalgorithmfordeterminingboundsandaccurateapproximatepvaluesoftherankproductstatisticforreplicateexperiments AT eisingarob afastalgorithmfordeterminingboundsandaccurateapproximatepvaluesoftherankproductstatisticforreplicateexperiments AT breitlingrainer afastalgorithmfordeterminingboundsandaccurateapproximatepvaluesoftherankproductstatisticforreplicateexperiments AT heskestom fastalgorithmfordeterminingboundsandaccurateapproximatepvaluesoftherankproductstatisticforreplicateexperiments AT eisingarob fastalgorithmfordeterminingboundsandaccurateapproximatepvaluesoftherankproductstatisticforreplicateexperiments AT breitlingrainer fastalgorithmfordeterminingboundsandaccurateapproximatepvaluesoftherankproductstatisticforreplicateexperiments |
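
As a rough illustration of the statistic named in the description, the sketch below computes a rank product and its exact null p-value by brute-force enumeration. It is not the authors' algorithm or their R code: it is the kind of computationally burdensome exact calculation the description contrasts with. The function names `rank_product` and `exact_pvalue`, and the assumption that under the null each of the k replicate ranks is independently uniform on 1..n, are illustrative choices made here and are not taken from the record.

```python
from itertools import product
from math import prod


def rank_product(ranks):
    """Rank product of one molecule: the product of its ranks over k replicates."""
    return prod(ranks)


def exact_pvalue(rho, n, k):
    """P(R_1 * ... * R_k <= rho) under the assumed null model.

    Assumption (hypothetical, for illustration): each R_i is independently
    uniform on 1..n. The enumeration visits all n**k rank combinations,
    so it is only feasible for very small n and k.
    """
    hits = sum(
        1
        for combo in product(range(1, n + 1), repeat=k)
        if prod(combo) <= rho
    )
    return hits / n**k


if __name__ == "__main__":
    # One molecule ranked 1st and 2nd among n = 4 molecules in k = 2 replicates.
    rho = rank_product((1, 2))
    print(rho, exact_pvalue(rho, n=4, k=2))
```

The cost of this enumeration grows as n to the power k, which is why bounds and fast approximations such as those summarized in the description are of practical interest.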