
Unsupervised assessment of microarray data quality using a Gaussian mixture model

BACKGROUND: Quality assessment of microarray data is an important and often challenging aspect of gene expression analysis. This task frequently involves the examination of a variety of summary statistics and diagnostic plots. The interpretation of these diagnostics is often subjective, and generall...


Bibliographic Details
Main Authors: Howard, Brian E, Sick, Beate, Heber, Steffen
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2717951/
https://www.ncbi.nlm.nih.gov/pubmed/19545436
http://dx.doi.org/10.1186/1471-2105-10-191
_version_ 1782169936608624640
author Howard, Brian E
Sick, Beate
Heber, Steffen
author_sort Howard, Brian E
collection PubMed
description BACKGROUND: Quality assessment of microarray data is an important and often challenging aspect of gene expression analysis. This task frequently involves the examination of a variety of summary statistics and diagnostic plots. The interpretation of these diagnostics is often subjective, and generally requires careful expert scrutiny. RESULTS: We show how an unsupervised classification technique based on the Expectation-Maximization (EM) algorithm and the naïve Bayes model can be used to automate microarray quality assessment. The method is flexible and can be easily adapted to accommodate alternate quality statistics and platforms. We evaluate our approach using Affymetrix 3' gene expression and exon arrays and compare the performance of this method to a similar supervised approach. CONCLUSION: This research illustrates the efficacy of an unsupervised classification approach for the purpose of automated microarray data quality assessment. Since our approach requires only unannotated training data, it is easy to customize and to keep up-to-date as technology evolves. In contrast to other "black box" classification systems, this method also allows for intuitive explanations.
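The description above names EM and a naïve Bayes model, which together amount to a Gaussian mixture whose features are treated as independent within each class. The following is a minimal sketch of that general technique under stated assumptions, not the authors' implementation: the function name `fit_diag_gmm`, the farthest-point initialisation, the variance floor `min_var`, and the fixed iteration count are all illustrative choices that do not come from the record.

```python
import math


def log_gauss(x, mean, var):
    """Log density of a univariate Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((u - v) ** 2 for u, v in zip(a, b))


def fit_diag_gmm(rows, k=2, iters=50, min_var=1e-6):
    """EM for a k-component Gaussian mixture with independent features.

    rows is a list of equal-length numeric feature vectors (hypothetically,
    one vector of quality statistics per array). Returns the mixing weights,
    per-component means and variances, and per-row responsibilities.
    """
    n, d = len(rows), len(rows[0])

    # Assumed initialisation: start at row 0, then repeatedly add the row
    # farthest from the means chosen so far.
    means = [list(rows[0])]
    while len(means) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, m) for m in means))
        means.append(list(far))

    # Every component starts with the global per-feature variance.
    gmean = [sum(r[j] for r in rows) / n for j in range(d)]
    gvar = [max(sum((r[j] - gmean[j]) ** 2 for r in rows) / n, min_var)
            for j in range(d)]
    variances = [list(gvar) for _ in range(k)]
    weights = [1.0 / k] * k
    resp = []

    for _ in range(iters):
        # E-step: posterior probability of each component for each row,
        # computed in log space and normalised against the largest term.
        resp = []
        for r in rows:
            logp = [math.log(weights[c])
                    + sum(log_gauss(r[j], means[c][j], variances[c][j])
                          for j in range(d))
                    for c in range(k)]
            top = max(logp)
            p = [math.exp(v - top) for v in logp]
            total = sum(p)
            resp.append([v / total for v in p])

        # M-step: responsibility-weighted updates of weights, means, variances.
        for c in range(k):
            nc = max(sum(row_resp[c] for row_resp in resp), 1e-12)
            weights[c] = nc / n
            for j in range(d):
                m = sum(resp[i][c] * rows[i][j] for i in range(n)) / nc
                v = sum(resp[i][c] * (rows[i][j] - m) ** 2
                        for i in range(n)) / nc
                means[c][j] = m
                variances[c][j] = max(v, min_var)

    return weights, means, variances, resp
```

A hard class label for each row can then be read off as the component with the largest responsibility. Because each component keeps a separate mean and variance per feature, the fitted parameters show which statistics separate the classes, which is one way to read the "intuitive explanations" the description mentions.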
format Text
id pubmed-2717951
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2717951 2009-07-30 Unsupervised assessment of microarray data quality using a Gaussian mixture model Howard, Brian E Sick, Beate Heber, Steffen BMC Bioinformatics Research Article BioMed Central 2009-06-22 /pmc/articles/PMC2717951/ /pubmed/19545436 http://dx.doi.org/10.1186/1471-2105-10-191 Text en Copyright © 2009 Howard et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Unsupervised assessment of microarray data quality using a Gaussian mixture model
topic Research Article