Bayesian models for pooling microarray studies with multiple sources of replications

BACKGROUND: Biologists often conduct multiple but different cDNA microarray studies that all target the same biological system or pathway. Within each study, replicate slides within repeated identical experiments are often produced. Pooling information across studies can help more accurately identif...

Bibliographic Details
Main Authors: Conlon, Erin M, Song, Joon J, Liu, Jun S
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1534062/
https://www.ncbi.nlm.nih.gov/pubmed/16677390
http://dx.doi.org/10.1186/1471-2105-7-247
_version_ 1782129098189963264
author Conlon, Erin M
Song, Joon J
Liu, Jun S
collection PubMed
description BACKGROUND: Biologists often conduct multiple but different cDNA microarray studies that all target the same biological system or pathway. Within each study, replicate slides within repeated identical experiments are often produced. Pooling information across studies can help more accurately identify true target genes. Here, we introduce a method to integrate multiple independent studies efficiently. RESULTS: We introduce a Bayesian hierarchical model to pool cDNA microarray data across multiple independent studies to identify highly expressed genes. Each study has multiple sources of variation, i.e. replicate slides within repeated identical experiments. Our model produces the gene-specific posterior probability of differential expression, which provides a direct method for ranking genes, and provides Bayesian estimates of false discovery rates (FDR). In simulations combining two and five independent studies, with fixed FDR levels, we observed large increases in the number of discovered genes in pooled versus individual analyses. When the number of output genes is fixed (e.g., top 100), the pooled model found appreciably more truly differentially expressed genes than the individual studies. We were also able to identify more differentially expressed genes from pooling two independent studies in Bacillus subtilis than from each individual data set. Finally, we observed that in our simulation studies our Bayesian FDR estimates tracked the true FDRs very well. CONCLUSION: Our method provides a cohesive framework for combining multiple but not identical microarray studies with several sources of replication, with data produced from the same platform. We assume that each study contains only two conditions: an experimental and a control sample. We demonstrated our model's suitability for a small number of studies that have been either pre-scaled or have no outliers.
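The description states that the gene-specific posterior probabilities of differential expression give a direct ranking of genes and a Bayesian estimate of the FDR. A minimal sketch of one common way to turn such probabilities into an FDR estimate is shown here: the estimated FDR of a gene list is the average of (1 - posterior probability) over the genes in the list. This is an illustrative assumption, not necessarily the exact estimator used in the article, and the function names `bayesian_fdr` and `largest_list_at_fdr` are hypothetical.

```python
def bayesian_fdr(posterior_probs, top_k):
    """Estimated FDR for the top_k genes ranked by posterior probability.

    Assumption: the FDR of a list is estimated as the mean of (1 - p) over
    the selected genes, where p is each gene's posterior probability of
    differential expression.
    """
    if top_k < 1 or top_k > len(posterior_probs):
        raise ValueError("top_k must be between 1 and the number of genes")
    ranked = sorted(posterior_probs, reverse=True)
    selected = ranked[:top_k]
    return sum(1.0 - p for p in selected) / len(selected)


def largest_list_at_fdr(posterior_probs, target_fdr):
    """Size of the largest top-ranked gene list whose estimated FDR
    does not exceed target_fdr (0 if no list qualifies)."""
    ranked = sorted(posterior_probs, reverse=True)
    best_k = 0
    running_error = 0.0
    for k, p in enumerate(ranked, start=1):
        running_error += 1.0 - p  # expected false discoveries so far
        if running_error / k <= target_fdr:
            best_k = k
    return best_k


if __name__ == "__main__":
    # Made-up posterior probabilities for five genes, for illustration only.
    probs = [0.99, 0.95, 0.90, 0.60, 0.20]
    print(bayesian_fdr(probs, 2))
    print(largest_list_at_fdr(probs, 0.10))
```

Fixing the list size (as in `bayesian_fdr`) corresponds to the "top 100" style of comparison mentioned in the description, while fixing the FDR level (as in `largest_list_at_fdr`) corresponds to counting discovered genes at a given FDR.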
format Text
id pubmed-1534062
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1534062 2006-08-10 Bayesian models for pooling microarray studies with multiple sources of replications Conlon, Erin M Song, Joon J Liu, Jun S BMC Bioinformatics Methodology Article BioMed Central 2006-05-05 /pmc/articles/PMC1534062/ /pubmed/16677390 http://dx.doi.org/10.1186/1471-2105-7-247 Text en Copyright © 2006 Conlon et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Bayesian models for pooling microarray studies with multiple sources of replications
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1534062/
https://www.ncbi.nlm.nih.gov/pubmed/16677390
http://dx.doi.org/10.1186/1471-2105-7-247