A meta-data based method for DNA microarray imputation

BACKGROUND: DNA microarray experiments are conducted in logical sets, such as time course profiling after a treatment is applied to the samples, or comparisons of the samples under two or more conditions. Due to cost and design constraints of spotted cDNA microarray experiments, each logical set com...

Bibliographic Details
Main Authors:	Jörnsten, Rebecka; Ouyang, Ming; Wang, Hui-Yu
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852325/ https://www.ncbi.nlm.nih.gov/pubmed/17394658 http://dx.doi.org/10.1186/1471-2105-8-109

author	Jörnsten, Rebecka; Ouyang, Ming; Wang, Hui-Yu
collection	PubMed
description	BACKGROUND: DNA microarray experiments are conducted in logical sets, such as time course profiling after a treatment is applied to the samples, or comparisons of the samples under two or more conditions. Due to cost and design constraints of spotted cDNA microarray experiments, each logical set commonly includes only a small number of replicates per condition. Despite the vast improvement of the microarray technology in recent years, missing values are prevalent. Intuitively, imputation of missing values is best done using many replicates within the same logical set. In practice, there are few replicates and thus reliable imputation within logical sets is difficult. However, it is in the case of few replicates that the presence of missing values, and how they are imputed, can have the most profound impact on the outcome of downstream analyses (e.g. significance analysis and clustering). This study explores the feasibility of imputation across logical sets, using the vast amount of publicly available microarray data to improve imputation reliability in the small sample size setting. RESULTS: We download all cDNA microarray data of Saccharomyces cerevisiae, Arabidopsis thaliana, and Caenorhabditis elegans from the Stanford Microarray Database. Through cross-validation and simulation, we find that, for all three species, our proposed imputation using data from public databases is far superior to imputation within a logical set, sometimes to an astonishing degree. Furthermore, the imputation root mean square error for significant genes is generally a lot less than that of non-significant ones. CONCLUSION: Since downstream analysis of significant genes, such as clustering and network analysis, can be very sensitive to small perturbations of estimated gene effects, it is highly recommended that researchers apply reliable data imputation prior to further analysis. Our method can also be applied to cDNA microarray experiments from other species, provided good reference data are available.
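The abstract contrasts imputation within a logical set with imputation that draws on reference data, scored by root mean square error on held-out values. The following is a minimal Python sketch of that comparison, not the authors' algorithm: the row-mean baseline, the nearest-neighbour rule, the function names (`row_mean_impute`, `reference_knn_impute`), and the toy numbers are all assumptions made for illustration.

```python
import math

def rmse(truth, estimate):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth))

def row_mean_impute(row):
    """Within-set baseline (assumed): fill None with the mean of the gene's observed values."""
    observed = [v for v in row if v is not None]
    fill = sum(observed) / len(observed) if observed else 0.0
    return [fill if v is None else v for v in row]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def reference_knn_impute(target, reference, k=2):
    """Across-set sketch (assumed): rank neighbour genes by distance in the
    reference profiles, then average the k nearest neighbours' observed
    values in the same target array."""
    n_genes = len(target)
    imputed = [list(row) for row in target]
    for g in range(n_genes):
        for j, value in enumerate(target[g]):
            if value is not None:
                continue
            ranked = sorted(
                (h for h in range(n_genes) if h != g),
                key=lambda h: euclidean(reference[g], reference[h]),
            )
            donors = [target[h][j] for h in ranked if target[h][j] is not None][:k]
            imputed[g][j] = sum(donors) / len(donors) if donors else 0.0
    return imputed

# Hypothetical reference data: 4 genes x 5 public arrays.
reference = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [1.1, 2.1, 2.9, 4.2, 5.1],
    [5.0, 4.0, 3.0, 2.0, 1.0],
    [4.9, 4.1, 3.2, 1.9, 0.8],
]
# Hypothetical logical set with only two arrays per gene.
truth = [[2.0, -1.0], [2.2, -0.8], [-2.0, 1.0], [-1.8, 1.2]]

# Cross-validation style check: hide one known value, impute it, score it.
target = [list(row) for row in truth]
target[0][1] = None

within = row_mean_impute(target[0])[1]
across = reference_knn_impute(target, reference, k=1)[0][1]
err_within = rmse([truth[0][1]], [within])
err_across = rmse([truth[0][1]], [across])
print(err_within, err_across)
```

With only two arrays in the set, the row mean has a single observed value to work from, while the reference profiles identify a co-expressed gene whose value in the same array serves as a donor. The toy data are constructed to show that effect and say nothing about the magnitudes reported in the study.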
format	Text
id	pubmed-1852325
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1852325 2007-04-17 A meta-data based method for DNA microarray imputation Jörnsten, Rebecka; Ouyang, Ming; Wang, Hui-Yu BMC Bioinformatics Research Article BioMed Central 2007-03-29 /pmc/articles/PMC1852325/ /pubmed/17394658 http://dx.doi.org/10.1186/1471-2105-8-109 Text en Copyright © 2007 Jörnsten et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	A meta-data based method for DNA microarray imputation
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1852325/ https://www.ncbi.nlm.nih.gov/pubmed/17394658 http://dx.doi.org/10.1186/1471-2105-8-109