
ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates

BACKGROUND: False discovery rate (FDR) control is commonly accepted as the most appropriate error control in multiple hypothesis testing problems. The accuracy of FDR estimation depends on the accuracy of the estimation of p-values from each test and validity of the underlying assumptions of the dis...


Bibliographic Details
Main authors: Li, Juntao; Paramita, Puteri; Choi, Kwok Pui; Karuturi, R Krishna Murthy
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3130718/
https://www.ncbi.nlm.nih.gov/pubmed/21595983
http://dx.doi.org/10.1186/1745-6150-6-27
author Li, Juntao
Paramita, Puteri
Choi, Kwok Pui
Karuturi, R Krishna Murthy
author_facet Li, Juntao
Paramita, Puteri
Choi, Kwok Pui
Karuturi, R Krishna Murthy
author_sort Li, Juntao
collection PubMed
description BACKGROUND: False discovery rate (FDR) control is commonly accepted as the most appropriate error control in multiple hypothesis testing problems. The accuracy of FDR estimation depends on the accuracy of the p-values estimated for each test and on the validity of the underlying distributional assumptions. However, in many practical testing problems, such as those in genomics, the p-values may be under-estimated or over-estimated for many known or unknown reasons. Consequently, the resulting FDR estimates become unreliable. RESULTS: We propose a new extrapolative method called Constrained Regression Recalibration (ConReg-R) to recalibrate the empirical p-values by modeling their distribution to improve the FDR estimates. Our ConReg-R method is based on the observation that accurately estimated p-values from true null hypotheses follow a uniform distribution and that the observed distribution of p-values is a mixture of the distributions of p-values from true null hypotheses and true alternative hypotheses. Hence, ConReg-R recalibrates the observed p-values so that they exhibit the properties of an ideal empirical p-value distribution. The proportion of true null hypotheses (π₀) and the FDR are estimated after the recalibration. CONCLUSIONS: ConReg-R provides an efficient way to improve the FDR estimates. It requires only the p-values from the tests and avoids permutation of the original test data. We demonstrate that the proposed method significantly improves FDR estimation on several gene expression datasets obtained from microarray and RNA-seq experiments. REVIEWERS: The manuscript was reviewed by Prof. Vladimir Kuznetsov, Prof. Philippe Broet, and Prof. Hongfang Liu (nominated by Prof. Yuriy Gusev).
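The observation in the abstract (null p-values are uniform, and the observed p-values are a mixture of null and alternative components) can be illustrated with a small simulation. The sketch below is not ConReg-R and does not implement its constrained regression. It uses a standard Storey-type estimator of π₀ and a plug-in FDR estimate, and it assumes a hypothetical distortion (squaring every p-value) to mimic under-estimated p-values. All function names, the simulated alternative distribution, and the parameter values are assumptions made for illustration only.

```python
import random


def estimate_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses.

    Assumes p-values above `lam` come mostly from true nulls, which are
    uniform on [0, 1] when the p-values are well calibrated.
    """
    m = len(pvalues)
    tail = sum(1 for p in pvalues if p > lam)
    return min(1.0, tail / (m * (1.0 - lam)))


def estimate_fdr(pvalues, threshold, pi0):
    """Plug-in FDR estimate when rejecting every test with p <= threshold."""
    rejected = sum(1 for p in pvalues if p <= threshold)
    if rejected == 0:
        return 0.0
    return min(1.0, pi0 * len(pvalues) * threshold / rejected)


def simulate_pvalues(m=10000, pi0=0.8, seed=0):
    """Simulate a mixture: uniform nulls plus alternatives piled up near zero."""
    rng = random.Random(seed)
    n_null = int(round(m * pi0))
    nulls = [rng.random() for _ in range(n_null)]
    # Hypothetical alternative distribution: u ** 5 concentrates mass near 0.
    alternatives = [rng.random() ** 5 for _ in range(m - n_null)]
    return nulls + alternatives


# Well-calibrated p-values: the pi0 estimate is close to (slightly above) 0.8.
pvals = simulate_pvalues()
pi0_hat = estimate_pi0(pvals)
fdr_hat = estimate_fdr(pvals, 0.01, pi0_hat)

# Hypothetical miscalibration: squaring makes every p-value too small,
# so the nulls are no longer uniform and pi0 is badly under-estimated.
distorted = [p ** 2 for p in pvals]
pi0_distorted = estimate_pi0(distorted)
fdr_distorted = estimate_fdr(distorted, 0.01, pi0_distorted)

print("calibrated:  pi0 =", round(pi0_hat, 3), "FDR(0.01) =", round(fdr_hat, 3))
print("distorted:   pi0 =", round(pi0_distorted, 3), "FDR(0.01) =", round(fdr_distorted, 3))
```

In this toy setting the distorted p-values yield a much smaller π₀ estimate and an over-optimistic FDR estimate, which is the kind of failure the abstract says recalibration is meant to address.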
format Online
Article
Text
id pubmed-3130718
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3130718 2011-07-07 ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates Li, Juntao Paramita, Puteri Choi, Kwok Pui Karuturi, R Krishna Murthy Biol Direct Research
BioMed Central 2011-05-20 /pmc/articles/PMC3130718/ /pubmed/21595983 http://dx.doi.org/10.1186/1745-6150-6-27 Text en Copyright ©2011 Li et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Li, Juntao
Paramita, Puteri
Choi, Kwok Pui
Karuturi, R Krishna Murthy
ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates
title ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates
title_full ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates
title_fullStr ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates
title_full_unstemmed ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates
title_short ConReg-R: Extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates
title_sort conreg-r: extrapolative recalibration of the empirical distribution of p-values to improve false discovery rate estimates
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3130718/
https://www.ncbi.nlm.nih.gov/pubmed/21595983
http://dx.doi.org/10.1186/1745-6150-6-27