Bayesian optimal discovery procedure for simultaneous significance testing

BACKGROUND: In high throughput screening, such as differential gene expression screening, drug sensitivity screening, and genome-wide RNAi screening, tens of thousands of tests need to be conducted simultaneously. However, the number of replicate measurements per test is extremely small, rarely exce...

Bibliographic Details
Main authors: Cao, Jing; Xie, Xian-Jin; Zhang, Song; Whitehurst, Angelique; White, Michael A
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2628883/
https://www.ncbi.nlm.nih.gov/pubmed/19126217
http://dx.doi.org/10.1186/1471-2105-10-5
author Cao, Jing
Xie, Xian-Jin
Zhang, Song
Whitehurst, Angelique
White, Michael A
collection PubMed
description BACKGROUND: In high-throughput screening, such as differential gene expression screening, drug sensitivity screening, and genome-wide RNAi screening, tens of thousands of tests need to be conducted simultaneously. However, the number of replicate measurements per test is extremely small, rarely exceeding 3. Several current approaches demonstrate that test statistics with shrinking variance estimates have more power than the traditional t statistic. RESULTS: We propose a Bayesian hierarchical model to incorporate the shrinkage concept by introducing a mixture structure on variance components. The estimates from the Bayesian model are utilized in the optimal discovery procedure (ODP) proposed by Storey in 2007, which was shown to have optimal performance in multiple significance tests. We compared the performance of the Bayesian ODP with several competing test statistics. CONCLUSION: We have conducted simulation studies with 2 to 6 replicates per gene. We have also included test results from two real datasets. The Bayesian ODP outperforms the other methods in our study, including the original ODP. The advantage of the Bayesian ODP becomes more significant when there are few replicates per test. The improvement over the original ODP is based on the fact that the Bayesian model borrows strength across genes in estimating unknown parameters. The proposed approach is efficient in computation due to the conjugate structure of the Bayesian model. The R code (see Additional file 1) to calculate the Bayesian ODP is provided.
format Text
id pubmed-2628883
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2628883 2009-01-21 Bayesian optimal discovery procedure for simultaneous significance testing Cao, Jing Xie, Xian-Jin Zhang, Song Whitehurst, Angelique White, Michael A BMC Bioinformatics Methodology Article BioMed Central 2009-01-06 /pmc/articles/PMC2628883/ /pubmed/19126217 http://dx.doi.org/10.1186/1471-2105-10-5 Text en Copyright © 2009 Cao et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bayesian optimal discovery procedure for simultaneous significance testing
topic Methodology Article