Pre-processing Agilent microarray data

BACKGROUND: Pre-processing methods for two-sample long oligonucleotide arrays, specifically the Agilent technology, have not been extensively studied. The goal of this study is to quantify some of the sources of error that affect measurement of expression using Agilent arrays and to compare Agilent's...

Bibliographic Details
Main authors:	Zahurak, Marianna; Parmigiani, Giovanni; Yu, Wayne; Scharpf, Robert B; Berman, David; Schaeffer, Edward; Shabbeer, Shabana; Cope, Leslie
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1876252/ https://www.ncbi.nlm.nih.gov/pubmed/17472750 http://dx.doi.org/10.1186/1471-2105-8-142

_version_	1782133521565876224
author	Zahurak, Marianna Parmigiani, Giovanni Yu, Wayne Scharpf, Robert B Berman, David Schaeffer, Edward Shabbeer, Shabana Cope, Leslie
author_facet	Zahurak, Marianna Parmigiani, Giovanni Yu, Wayne Scharpf, Robert B Berman, David Schaeffer, Edward Shabbeer, Shabana Cope, Leslie
author_sort	Zahurak, Marianna
collection	PubMed
description	BACKGROUND: Pre-processing methods for two-sample long oligonucleotide arrays, specifically the Agilent technology, have not been extensively studied. The goal of this study is to quantify some of the sources of error that affect measurement of expression using Agilent arrays and to compare Agilent's Feature Extraction software with pre-processing methods that have become the standard for normalization of cDNA arrays. These include log transformation followed by loess normalization with or without background subtraction and often a between-array scale normalization procedure. The larger goal is to define best study design and pre-processing practices for Agilent arrays, and we offer some suggestions. RESULTS: Simple loess normalization without background subtraction produced the lowest variability. However, without background subtraction, fold changes were biased towards zero, particularly at low intensities. ROC analysis of a spike-in experiment showed that differentially expressed genes are most reliably detected when background is not subtracted. Loess normalization and no background subtraction yielded an AUC of 99.7% compared with 88.8% for Agilent processed fold changes. All methods performed well when error was taken into account by t- or z-statistics, with AUCs ≥ 99.8%. A substantial proportion of genes showed dye effects, 43% (99% CI: 39%, 47%). However, these effects were generally small regardless of the pre-processing method. CONCLUSION: Simple loess normalization without background subtraction resulted in low-variance fold changes that more reliably ranked gene expression than the other methods. While t-statistics and other measures that take variation into account, including Agilent's z-statistic, can also be used to reliably select differentially expressed genes, fold changes are a standard measure of differential expression for exploratory work, cross-platform comparison, and biological interpretation and cannot be entirely replaced. Although dye effects are small for most genes, many array features are affected. Therefore, an experimental design that incorporates dye swaps or a common reference could be valuable.
format	Text
id	pubmed-1876252
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1876252 2007-05-22 Pre-processing Agilent microarray data Zahurak, Marianna Parmigiani, Giovanni Yu, Wayne Scharpf, Robert B Berman, David Schaeffer, Edward Shabbeer, Shabana Cope, Leslie BMC Bioinformatics Research Article BioMed Central 2007-05-01 /pmc/articles/PMC1876252/ /pubmed/17472750 http://dx.doi.org/10.1186/1471-2105-8-142 Text en Copyright © 2007 Zahurak et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Research Article Zahurak, Marianna Parmigiani, Giovanni Yu, Wayne Scharpf, Robert B Berman, David Schaeffer, Edward Shabbeer, Shabana Cope, Leslie Pre-processing Agilent microarray data
title	Pre-processing Agilent microarray data
title_full	Pre-processing Agilent microarray data
title_fullStr	Pre-processing Agilent microarray data
title_full_unstemmed	Pre-processing Agilent microarray data
title_short	Pre-processing Agilent microarray data
title_sort	pre-processing agilent microarray data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1876252/ https://www.ncbi.nlm.nih.gov/pubmed/17472750 http://dx.doi.org/10.1186/1471-2105-8-142
work_keys_str_mv	AT zahurakmarianna preprocessingagilentmicroarraydata AT parmigianigiovanni preprocessingagilentmicroarraydata AT yuwayne preprocessingagilentmicroarraydata AT scharpfrobertb preprocessingagilentmicroarraydata AT bermandavid preprocessingagilentmicroarraydata AT schaefferedward preprocessingagilentmicroarraydata AT shabbeershabana preprocessingagilentmicroarraydata AT copeleslie preprocessingagilentmicroarraydata
