Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data
Motivation: The small number of samples in many microarray experiments is a challenge for the correct identification of differentially expressed genes (DEGs) by conventional statistical means. Information from public microarray databases can help identify DEGs more efficiently. To model variou...
Main authors: | Kim, Mingoo; Cho, Sung Bum; Kim, Ju Han |
---|---|
Format: | Text |
Language: | English |
Published: | Oxford University Press, 2010 |
Subjects: | Original Papers |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2820675/ https://www.ncbi.nlm.nih.gov/pubmed/20015947 http://dx.doi.org/10.1093/bioinformatics/btp685 |
_version_ | 1782177401700089856 |
---|---|
author | Kim, Mingoo Cho, Sung Bum Kim, Ju Han |
author_facet | Kim, Mingoo Cho, Sung Bum Kim, Ju Han |
author_sort | Kim, Mingoo |
collection | PubMed |
description | Motivation: The small number of samples in many microarray experiments is a challenge for the correct identification of differentially expressed genes (DEGs) by conventional statistical means. Information from public microarray databases can help identify DEGs more efficiently. To model the various experimental conditions of a public microarray database, we applied a Gaussian mixture model and extracted bi- or tri-modal distributions of gene expression. The prior variance of Baldi's Bayesian framework was estimated for the analysis of small sample-sized datasets. Results: First, we estimated the prior variance of a gene's expression by pooling variances obtained from mixture modeling of large samples in the public microarray database. Then, using the prior variance, we identified DEGs in small sample-sized test datasets using Baldi's framework. For the benchmark study, we generated test datasets having several samples from relatively large datasets. Our proposed method outperformed other benchmark methods in terms of detecting gold-standard DEGs from the test datasets. The results may provide evidence for the use of public microarray databases in microarray data analysis. Availability: Supplementary data are available at http://www.snubi.org/publication/MixBayes Contact: juhan@snu.ac.kr |
format | Text |
id | pubmed-2820675 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2010 |
publisher | Oxford University Press |
record_format | MEDLINE/PubMed |
spelling | pubmed-2820675 2010-02-12 Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data Kim, Mingoo Cho, Sung Bum Kim, Ju Han Bioinformatics Original Papers Motivation: The small number of samples in many microarray experiments is a challenge for the correct identification of differentially expressed genes (DEGs) by conventional statistical means. Information from public microarray databases can help identify DEGs more efficiently. To model the various experimental conditions of a public microarray database, we applied a Gaussian mixture model and extracted bi- or tri-modal distributions of gene expression. The prior variance of Baldi's Bayesian framework was estimated for the analysis of small sample-sized datasets. Results: First, we estimated the prior variance of a gene's expression by pooling variances obtained from mixture modeling of large samples in the public microarray database. Then, using the prior variance, we identified DEGs in small sample-sized test datasets using Baldi's framework. For the benchmark study, we generated test datasets having several samples from relatively large datasets. Our proposed method outperformed other benchmark methods in terms of detecting gold-standard DEGs from the test datasets. The results may provide evidence for the use of public microarray databases in microarray data analysis. Availability: Supplementary data are available at http://www.snubi.org/publication/MixBayes Contact: juhan@snu.ac.kr Oxford University Press 2010-02-15 2009-12-16 /pmc/articles/PMC2820675/ /pubmed/20015947 http://dx.doi.org/10.1093/bioinformatics/btp685 Text en © The Author(s) 2009. Published by Oxford University Press. 
http://creativecommons.org/licenses/by-nc/2.0/uk/ This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.5), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. |
spellingShingle | Original Papers Kim, Mingoo Cho, Sung Bum Kim, Ju Han Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data |
title | Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data |
title_full | Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data |
title_fullStr | Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data |
title_full_unstemmed | Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data |
title_short | Mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data |
title_sort | mixture-model based estimation of gene expression variance from public database improves identification of differentially expressed genes in small sized microarray data |
topic | Original Papers |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2820675/ https://www.ncbi.nlm.nih.gov/pubmed/20015947 http://dx.doi.org/10.1093/bioinformatics/btp685 |
work_keys_str_mv | AT kimmingoo mixturemodelbasedestimationofgeneexpressionvariancefrompublicdatabaseimprovesidentificationofdifferentiallyexpressedgenesinsmallsizedmicroarraydata AT chosungbum mixturemodelbasedestimationofgeneexpressionvariancefrompublicdatabaseimprovesidentificationofdifferentiallyexpressedgenesinsmallsizedmicroarraydata AT kimjuhan mixturemodelbasedestimationofgeneexpressionvariancefrompublicdatabaseimprovesidentificationofdifferentiallyexpressedgenesinsmallsizedmicroarraydata |
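
The abstract in this record describes two steps: pooling variances from mixture components fitted to a large public database, then using the pooled value as the prior variance in Baldi's Bayesian framework for small datasets. The sketch below illustrates that idea only and is not the authors' implementation. It assumes the commonly cited Baldi and Long regularized variance, (ν0·σ0² + (n−1)·s²) / (ν0 + n − 2), and a degrees-of-freedom weighted pooling rule. The function names, the pooling rule, and the default `nu0` are all hypothetical choices made for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_variance(component_variances, component_sizes):
    """Pool per-component variances, weighting each by its degrees of freedom.

    Assumption: the inputs are the variances and sample counts of the mixture
    components fitted for one gene; this pooling rule is illustrative only.
    """
    numerator = sum((n - 1) * v for v, n in zip(component_variances, component_sizes))
    denominator = sum(n - 1 for n in component_sizes)
    return numerator / denominator


def regularized_variance(sample, prior_variance, nu0):
    """Shrink the sample variance toward a prior variance.

    Uses (nu0 * prior + (n - 1) * s^2) / (nu0 + n - 2), where nu0 acts as the
    number of pseudo-observations backing the prior (hypothetical value here).
    """
    n = len(sample)
    s2 = variance(sample)  # unbiased sample variance, needs n >= 2
    return (nu0 * prior_variance + (n - 1) * s2) / (nu0 + n - 2)


def regularized_t(group_a, group_b, prior_variance, nu0=10):
    """Two-sample t-like statistic computed with regularized variances."""
    var_a = regularized_variance(group_a, prior_variance, nu0)
    var_b = regularized_variance(group_b, prior_variance, nu0)
    standard_error = sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


# Toy usage: two mixture components give the prior, three replicates per group.
prior = pooled_variance([1.0, 3.0], [11, 11])
t_stat = regularized_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0], prior, nu0=10)
```

With only three replicates per group the sample variance alone is unstable, so a larger `nu0` pulls the estimate closer to the database-derived prior. How strongly to weight the prior is a design choice that this record does not specify.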