
A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments

BACKGROUND: The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Bo...
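As a rough illustration of the statistic the abstract refers to, the sketch below computes rank products for a small data matrix and two simple p-value estimates of the kind the abstract lists as existing approaches: a gamma approximation and a Monte Carlo estimate standing in for the permutation approach. It is not the bounds algorithm of the article. The function names, the convention that rank 1 is the largest value in a replicate, and the `(n + 1)^k` normalisation in the gamma approximation are all assumptions made for this example.

```python
import math
import random


def rank_products(data):
    """Return the rank product of each molecule.

    data: list of n molecules, each a list of k replicate measurements.
    Assumption: rank 1 is the largest value within a replicate; ties are
    broken by input order.
    """
    n, k = len(data), len(data[0])
    products = [1] * n
    for j in range(k):
        order = sorted(range(n), key=lambda g: data[g][j], reverse=True)
        for rank, g in enumerate(order, start=1):
            products[g] *= rank
    return products


def gamma_pvalue(rp, n, k):
    """Gamma approximation of P(RP <= rp) under the null.

    Assumption: -log(rp / (n + 1)^k) is treated as Gamma(k, 1). For integer
    k the survival function has the closed Erlang form used here.
    """
    x = k * math.log(n + 1) - math.log(rp)
    return math.exp(-x) * sum(x ** j / math.factorial(j) for j in range(k))


def monte_carlo_pvalue(rp, n, k, draws=20000, seed=0):
    """Monte Carlo estimate of P(RP <= rp) under the null.

    Under the null, a molecule's rank in each replicate is uniform on 1..n
    and independent across replicates, so k independent draws are sampled.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(draws):
        prod = 1
        for _ in range(k):
            prod *= rng.randint(1, n)
        if prod <= rp:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (draws + 1)
```

The Monte Carlo estimate cannot resolve p-values much smaller than `1 / draws`, and the gamma approximation treats discrete ranks as continuous, which is consistent with the abstract's remark that existing approaches are either computationally burdensome or inaccurate in the tail.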

Bibliographic Details
Main authors:	Heskes, Tom; Eisinga, Rob; Breitling, Rainer
Format:	Online Article Text
Language:	English
Published:	BioMed Central 2014
Subjects:	Methodology Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4245829/ https://www.ncbi.nlm.nih.gov/pubmed/25413493 http://dx.doi.org/10.1186/s12859-014-0367-1

author	Heskes, Tom Eisinga, Rob Breitling, Rainer
collection	PubMed
description	BACKGROUND: The rank product method is a powerful statistical technique for identifying differentially expressed molecules in replicated experiments. A critical issue in molecule selection is accurate calculation of the p-value of the rank product statistic to adequately address multiple testing. Exact calculation as well as permutation and gamma approximations have been proposed to determine molecule-level significance. These approaches have serious drawbacks: they are either computationally burdensome or provide inaccurate estimates in the tail of the p-value distribution. RESULTS: We derive strict lower and upper bounds to the exact p-value along with an accurate approximation that can be used to assess the significance of the rank product statistic in a computationally fast manner. The bounds and the proposed approximation are shown to provide far better accuracy than existing approximate methods in determining tail probabilities, with the slightly conservative upper bound protecting against false positives. We illustrate the proposed method in the context of a recently published analysis on transcriptomic profiling performed in blood. CONCLUSIONS: We provide a method to determine upper bounds and accurate approximate p-values of the rank product statistic. The proposed algorithm provides an order of magnitude increase in throughput compared with current approaches and offers the opportunity to explore new application domains with even larger multiple testing issues. The R code is published in one of the Additional files and is available at http://www.ru.nl/publish/pages/726696/rankprodbounds.zip. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0367-1) contains supplementary material, which is available to authorized users.
format	Online Article Text
id	pubmed-4245829
institution	National Center for Biotechnology Information
language	English
publishDate	2014
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-4245829 2014-11-28 A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments Heskes, Tom Eisinga, Rob Breitling, Rainer BMC Bioinformatics Methodology Article BioMed Central 2014-11-21 /pmc/articles/PMC4245829/ /pubmed/25413493 http://dx.doi.org/10.1186/s12859-014-0367-1 Text en © Heskes et al.; licensee BioMed Central Ltd. 2014 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
title	A fast algorithm for determining bounds and accurate approximate p-values of the rank product statistic for replicate experiments
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4245829/ https://www.ncbi.nlm.nih.gov/pubmed/25413493 http://dx.doi.org/10.1186/s12859-014-0367-1

Similar items