
Evaluation of normalization methods for cDNA microarray data by k-NN classification

BACKGROUND: Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye bia...


Bibliographic Details
Main Authors:	Wu, Wei; Xing, Eric P; Myers, Connie; Mian, I Saira; Bissell, Mina J
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1201132/ https://www.ncbi.nlm.nih.gov/pubmed/16045803 http://dx.doi.org/10.1186/1471-2105-6-191

_version_	1782124888831557632
author	Wu, Wei Xing, Eric P Myers, Connie Mian, I Saira Bissell, Mina J
author_facet	Wu, Wei Xing, Eric P Myers, Connie Mian, I Saira Bissell, Mina J
author_sort	Wu, Wei
collection	PubMed
description	BACKGROUND: Non-biological factors give rise to unwanted variations in cDNA microarray data. There are many normalization methods designed to remove such variations. However, to date there have been few published systematic evaluations of these techniques for removing variations arising from dye biases in the context of downstream, higher-order analytical tasks such as classification. RESULTS: Ten location normalization methods that adjust spatial- and/or intensity-dependent dye biases, and three scale methods that adjust scale differences, were applied, individually and in combination, to five distinct, published, cancer biology-related cDNA microarray data sets. Leave-one-out cross-validation (LOOCV) classification error was employed as the quantitative end-point for assessing the effectiveness of a normalization method. In particular, a known classifier, k-nearest neighbor (k-NN), was estimated from data normalized using a given technique, and the LOOCV error rate of the ensuing model was computed. We found that k-NN classifiers are sensitive to dye biases in the data. Using NONRM and GMEDIAN as baseline methods, our results show that single-bias-removal techniques that remove either spatial-dependent dye bias (referred to hereafter as the spatial effect) or intensity-dependent dye bias (referred to hereafter as the intensity effect) moderately reduce LOOCV classification errors, whereas double-bias-removal techniques that remove both the spatial and intensity effects reduce LOOCV classification errors even further. Of the 41 different strategies examined, three two-step processes, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, all of which removed the intensity effect globally and the spatial effect locally, appear to reduce LOOCV classification errors most consistently and effectively across all data sets. We also found that the investigated scale normalization methods do not reduce LOOCV classification error.
CONCLUSION: Using LOOCV error of k-NNs as the evaluation criterion, three double-bias-removal normalization strategies, IGLOESS-SLFILTERW7, ISTSPLINE-SLLOESS and IGLOESS-SLLOESS, outperform other strategies for removing spatial effect, intensity effect and scale differences from cDNA microarray data. The apparent sensitivity of k-NN LOOCV classification error to dye biases suggests that this criterion provides an informative measure for evaluating normalization methods. All the computational tools used in this study were implemented using the R language for statistical computing and graphics.
format	Text
id	pubmed-1201132
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1201132 2005-09-10 Evaluation of normalization methods for cDNA microarray data by k-NN classification Wu, Wei Xing, Eric P Myers, Connie Mian, I Saira Bissell, Mina J BMC Bioinformatics Research Article BioMed Central 2005-07-26 /pmc/articles/PMC1201132/ /pubmed/16045803 http://dx.doi.org/10.1186/1471-2105-6-191 Text en Copyright © 2005 Wu et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Wu, Wei Xing, Eric P Myers, Connie Mian, I Saira Bissell, Mina J Evaluation of normalization methods for cDNA microarray data by k-NN classification
title	Evaluation of normalization methods for cDNA microarray data by k-NN classification
title_sort	evaluation of normalization methods for cdna microarray data by k-nn classification
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1201132/ https://www.ncbi.nlm.nih.gov/pubmed/16045803 http://dx.doi.org/10.1186/1471-2105-6-191
work_keys_str_mv	AT wuwei evaluationofnormalizationmethodsforcdnamicroarraydatabyknnclassification AT xingericp evaluationofnormalizationmethodsforcdnamicroarraydatabyknnclassification AT myersconnie evaluationofnormalizationmethodsforcdnamicroarraydatabyknnclassification AT mianisaira evaluationofnormalizationmethodsforcdnamicroarraydatabyknnclassification AT bissellminaj evaluationofnormalizationmethodsforcdnamicroarraydatabyknnclassification
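
The evaluation criterion described in the abstract, the LOOCV error of a k-NN classifier computed on data normalized in a given way, can be illustrated with a minimal sketch. The study itself used R; the Python below is only an illustration under stated assumptions. The toy log-ratio values, the per-array offsets standing in for a dye bias, and the `median_center` helper are all hypothetical and are not the authors' data or their GMEDIAN implementation.

```python
import math
from collections import Counter
from statistics import median


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training samples closest to query (Euclidean)."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def loocv_error(samples, labels, k):
    """Fraction of samples misclassified when each one is held out in turn."""
    errors = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if knn_predict(train_x, train_y, samples[i], k) != labels[i]:
            errors += 1
    return errors / len(samples)


def median_center(array):
    """Hypothetical stand-in for a global normalization: subtract the array median."""
    m = median(array)
    return [value - m for value in array]


# Toy log-ratios: six arrays (rows) by four genes (columns), two classes.
labels = ["A", "A", "A", "B", "B", "B"]
clean = [
    [1.0, 1.2, -1.0, -0.8],
    [0.8, 1.0, -1.2, -1.0],
    [1.2, 0.8, -0.8, -1.2],
    [-1.0, -1.2, 1.0, 0.8],
    [-0.8, -1.0, 1.2, 1.0],
    [-1.2, -0.8, 0.8, 1.2],
]

# Assumed bias model: a constant offset added to every gene on some arrays.
offsets = [0.0, 0.0, 4.0, 4.0, 0.0, 0.0]
biased = [[v + off for v in row] for row, off in zip(clean, offsets)]
centered = [median_center(row) for row in biased]

for name, data in [("clean", clean), ("biased", biased), ("centered", centered)]:
    print(name, loocv_error(data, labels, k=1))
```

In this toy setting the two offset arrays end up nearer to each other than to their own classes, so the LOOCV error rises, and centering each array restores the class structure. Real dye biases are spatial- and intensity-dependent, so a constant offset is only the simplest possible case.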

