
Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data

Motivation: The small number of samples in many microarray experiments is a challenge for the correct identification of differentially expressed genes (DEGs) by conventional statistical means. Information from public microarray databases can help identify DEGs more efficiently. To model variou...


Bibliographic Details
Main authors: Kim, Mingoo, Cho, Sung Bum, Kim, Ju Han
Format: Text
Language: English
Published: Oxford University Press 2010
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2820675/
https://www.ncbi.nlm.nih.gov/pubmed/20015947
http://dx.doi.org/10.1093/bioinformatics/btp685
author Kim, Mingoo
Cho, Sung Bum
Kim, Ju Han
collection PubMed
description Motivation: The small number of samples in many microarray experiments is a challenge for the correct identification of differentially expressed genes (DEGs) by conventional statistical means. Information from public microarray databases can help identify DEGs more efficiently. To model the various experimental conditions of a public microarray database, we applied a Gaussian mixture model and extracted bi- or tri-modal distributions of gene expression. The prior variance of Baldi's Bayesian framework was estimated for the analysis of small sample-sized datasets. Results: First, we estimated the prior variance of a gene's expression by pooling variances obtained from mixture modeling of large samples in the public microarray database. Then, using the prior variance, we identified DEGs in small sample-sized test datasets using Baldi's framework. For the benchmark study, we generated test datasets having several samples from relatively large datasets. Our proposed method outperformed other benchmark methods in terms of detecting gold-standard DEGs from the test datasets. The results may serve as challenging evidence for the use of public microarray databases in microarray data analysis. Availability: Supplementary data are available at http://www.snubi.org/publication/MixBayes Contact: juhan@snu.ac.kr
format Text
id pubmed-2820675
institution National Center for Biotechnology Information
language English
publishDate 2010
publisher Oxford University Press
record_format MEDLINE/PubMed
spelling pubmed-2820675 2010-02-12 Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data Kim, Mingoo Cho, Sung Bum Kim, Ju Han Bioinformatics Original Papers Motivation: The small number of samples in many microarray experiments is a challenge for the correct identification of differentially expressed genes (DEGs) by conventional statistical means. Information from public microarray databases can help identify DEGs more efficiently. To model the various experimental conditions of a public microarray database, we applied a Gaussian mixture model and extracted bi- or tri-modal distributions of gene expression. The prior variance of Baldi's Bayesian framework was estimated for the analysis of small sample-sized datasets. Results: First, we estimated the prior variance of a gene's expression by pooling variances obtained from mixture modeling of large samples in the public microarray database. Then, using the prior variance, we identified DEGs in small sample-sized test datasets using Baldi's framework. For the benchmark study, we generated test datasets having several samples from relatively large datasets. Our proposed method outperformed other benchmark methods in terms of detecting gold-standard DEGs from the test datasets. The results may serve as challenging evidence for the use of public microarray databases in microarray data analysis. Availability: Supplementary data are available at http://www.snubi.org/publication/MixBayes Contact: juhan@snu.ac.kr Oxford University Press 2010-02-15 2009-12-16 /pmc/articles/PMC2820675/ /pubmed/20015947 http://dx.doi.org/10.1093/bioinformatics/btp685 Text en © The Author(s) 2009. Published by Oxford University Press.
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data
topic Original Papers
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2820675/
https://www.ncbi.nlm.nih.gov/pubmed/20015947
http://dx.doi.org/10.1093/bioinformatics/btp685
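Illustrative sketch (not part of the record): the two steps named in the abstract, pooling variances from mixture components into a prior variance and then using that prior in a regularized test on a small dataset, might look roughly like the following. The abstract does not say how the variances are pooled or how the prior is weighted, so the degrees-of-freedom weighting in `pooled_variance`, the `nu0` prior-strength parameter, and all function names are assumptions made for illustration. The regularized variance follows the form usually associated with Baldi and Long's framework, (nu0 * prior + (n - 1) * s^2) / (nu0 + n - 2), and the mixture components are assumed to have been fitted already.

```python
import math


def pooled_variance(component_vars, component_sizes):
    """Pool per-component variances, weighting each by its degrees of freedom.

    Assumption: the abstract only says variances are pooled; this weighting
    is the textbook pooled-variance formula, not necessarily the authors'.
    """
    numerator = sum((n - 1) * v for v, n in zip(component_vars, component_sizes))
    denominator = sum(n - 1 for n in component_sizes)
    return numerator / denominator


def sample_variance(values):
    """Unbiased sample variance of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def regularized_variance(values, prior_var, nu0):
    """Blend the sample variance with a prior variance.

    nu0 (hypothetical parameter) controls how strongly the prior counts.
    """
    n = len(values)
    s2 = sample_variance(values)
    return (nu0 * prior_var + (n - 1) * s2) / (nu0 + n - 2)


def regularized_t(group_a, group_b, prior_var, nu0):
    """t-like statistic computed with regularized variances for both groups."""
    var_a = regularized_variance(group_a, prior_var, nu0)
    var_b = regularized_variance(group_b, prior_var, nu0)
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return (mean_a - mean_b) / math.sqrt(var_a / len(group_a) + var_b / len(group_b))


# Made-up numbers: two mixture components for one gene, then a 3-vs-3 test.
prior = pooled_variance([1.0, 3.0], [11, 11])
t_stat = regularized_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], prior, nu0=10)
```

With only three samples per group the ordinary sample variance is unstable, so the statistic leans on the database-derived prior in proportion to `nu0`; a larger `nu0` pulls the estimate further toward the prior.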