Integrative missing value estimation for microarray data

BACKGROUND: Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples....

Bibliographic Details
Main authors:	Hu, Jianjun; Li, Haifeng; Waterman, Michael S; Zhou, Xianghong Jasmine
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1622759/ https://www.ncbi.nlm.nih.gov/pubmed/17038176 http://dx.doi.org/10.1186/1471-2105-7-449

author	Hu, Jianjun; Li, Haifeng; Waterman, Michael S; Zhou, Xianghong Jasmine
author_facet	Hu, Jianjun; Li, Haifeng; Waterman, Michael S; Zhou, Xianghong Jasmine
author_sort	Hu, Jianjun
collection	PubMed
description	BACKGROUND: Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples. RESULTS: We present the integrative Missing Value Estimation method (iMISS), which incorporates information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference datasets into consideration. To determine whether the given reference datasets are sufficiently informative for integration, we use a submatrix imputation approach. Our experiments showed that iMISS can significantly and consistently improve the accuracy of the state-of-the-art Local Least Squares (LLS) imputation algorithm, by up to 15% in our benchmark tests. CONCLUSION: We demonstrated that the order-statistics-based integrative imputation algorithms can achieve significant improvements over state-of-the-art missing value estimation approaches such as LLS and are especially good for imputing microarray datasets with a limited number of samples, high rates of missing data, or very noisy measurements. With the rapid accumulation of microarray datasets, the performance of our approach can be further improved by incorporating larger and more appropriate reference datasets.
format	Text
id	pubmed-1622759
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1622759 2006-10-26 Integrative missing value estimation for microarray data Hu, Jianjun; Li, Haifeng; Waterman, Michael S; Zhou, Xianghong Jasmine BMC Bioinformatics Research Article BACKGROUND: Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, their performance is unsatisfactory for datasets with high rates of missing data, high measurement noise, or limited numbers of samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples. RESULTS: We present the integrative Missing Value Estimation method (iMISS), which incorporates information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking reference datasets into consideration. To determine whether the given reference datasets are sufficiently informative for integration, we use a submatrix imputation approach. Our experiments showed that iMISS can significantly and consistently improve the accuracy of the state-of-the-art Local Least Squares (LLS) imputation algorithm, by up to 15% in our benchmark tests. CONCLUSION: We demonstrated that the order-statistics-based integrative imputation algorithms can achieve significant improvements over state-of-the-art missing value estimation approaches such as LLS and are especially good for imputing microarray datasets with a limited number of samples, high rates of missing data, or very noisy measurements. With the rapid accumulation of microarray datasets, the performance of our approach can be further improved by incorporating larger and more appropriate reference datasets. BioMed Central 2006-10-12 /pmc/articles/PMC1622759/ /pubmed/17038176 http://dx.doi.org/10.1186/1471-2105-7-449 Text en Copyright © 2006 Hu et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Hu, Jianjun Li, Haifeng Waterman, Michael S Zhou, Xianghong Jasmine Integrative missing value estimation for microarray data
title	Integrative missing value estimation for microarray data
title_full	Integrative missing value estimation for microarray data
title_fullStr	Integrative missing value estimation for microarray data
title_full_unstemmed	Integrative missing value estimation for microarray data
title_short	Integrative missing value estimation for microarray data
title_sort	integrative missing value estimation for microarray data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1622759/ https://www.ncbi.nlm.nih.gov/pubmed/17038176 http://dx.doi.org/10.1186/1471-2105-7-449
