
A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments

BACKGROUND: The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Bo...


Bibliographic Details
Main authors: Heskes, Tom, Eisinga, Rob, Breitling, Rainer
Format: Online Article Text
Language: English
Published: BioMed Central 2014
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4245829/
https://www.ncbi.nlm.nih.gov/pubmed/25413493
http://dx.doi.org/10.1186/s12859-014-0367-1
_version_ 1782346432709132288
author Heskes, Tom
Eisinga, Rob
Breitling, Rainer
author_facet Heskes, Tom
Eisinga, Rob
Breitling, Rainer
author_sort Heskes, Tom
collection PubMed
description BACKGROUND: The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Exact calculation as well as permutation and gamma approximations have been proposed to determine molecule-level significance. These current approaches have serious drawbacks, as they are either computationally burdensome or provide inaccurate estimates in the tail of the p-value distribution. RESULTS: We derive strict lower and upper bounds to the exact p-value along with an accurate approximation that can be used to assess the significance of the rank product statistic in a computationally fast manner. The bounds and the proposed approximation are shown to provide far better accuracy than existing approximate methods in determining tail probabilities, with the slightly conservative upper bound protecting against false positives. We illustrate the proposed method in the context of a recently published analysis on transcriptomic profiling performed in blood. CONCLUSIONS: We provide a method to determine upper bounds and accurate approximate p-values of the rank product statistic. The proposed algorithm provides an order of magnitude increase in throughput as compared with current approaches and offers the opportunity to explore new application domains with even larger multiple testing issues. The R code is published in one of the Additional files and is available at http://www.ru.nl/publish/pages/726696/rankprodbounds.zip. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0367-1) contains supplementary material, which is available to authorized users.
format Online
Article
Text
id pubmed-4245829
institution National Center for Biotechnology Information
language English
publishDate 2014
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-4245829 2014-11-28 A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments Heskes, Tom Eisinga, Rob Breitling, Rainer BMC Bioinformatics Methodology Article BioMed Central 2014-11-21 /pmc/articles/PMC4245829/ /pubmed/25413493 http://dx.doi.org/10.1186/s12859-014-0367-1 Text en © Heskes et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Methodology Article
Heskes, Tom
Eisinga, Rob
Breitling, Rainer
A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
title A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
title_full A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
title_fullStr A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
title_full_unstemmed A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
title_short A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
title_sort fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4245829/
https://www.ncbi.nlm.nih.gov/pubmed/25413493
http://dx.doi.org/10.1186/s12859-014-0367-1
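
The description in this record mentions gamma approximations as one of the existing ways to obtain a p-value for the rank product statistic. As a rough illustration of that idea only (not the bounds or the approximation derived in the article, and not the published R code), the following Python sketch computes rank products and a continuous gamma approximation. It assumes that each rank divided by n + 1 is treated as approximately uniform on (0, 1), so that the negative log of the scaled rank product is approximately Gamma(k, 1) under the null hypothesis. The function names and the n + 1 scaling are illustrative assumptions, not taken from the article.

```python
import math
from typing import List, Sequence


def rank_products(ranks: Sequence[Sequence[int]]) -> List[int]:
    """Return the rank product of each molecule.

    ranks[i][j] is the rank of molecule i in replicate j (1 = most extreme).
    """
    return [math.prod(row) for row in ranks]


def gamma_approx_pvalue(rank_product: int, n: int, k: int) -> float:
    """Approximate P(RP <= rank_product) for n molecules and k replicates.

    Assumption (illustrative): rank / (n + 1) is roughly Uniform(0, 1), so
    x = -log(rank_product / (n + 1)**k) is roughly Gamma(shape=k, scale=1).
    A small rank product gives a large x, so the p-value is the upper tail.
    """
    x = k * math.log(n + 1) - math.log(rank_product)
    # Upper tail of a Gamma(k, 1) variable with integer shape k (Erlang):
    # exp(-x) * sum_{j=0}^{k-1} x**j / j!
    tail = sum(x**j / math.factorial(j) for j in range(k))
    return math.exp(-x) * tail


if __name__ == "__main__":
    # Hypothetical ranks: 3 molecules measured in 3 replicates out of n = 1000.
    example = [[1, 2, 1], [500, 620, 480], [10, 900, 35]]
    for rp in rank_products(example):
        print(rp, gamma_approx_pvalue(rp, n=1000, k=3))
```

As the description notes, approximations of this kind can be inaccurate in the tail of the p-value distribution, which is the problem the article's bounds are meant to address.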