
Correction of scaling mismatches in oligonucleotide microarray data

BACKGROUND: Gene expression microarray data is notoriously subject to high signal variability. Moreover, unavoidable variation in the concentration of transcripts applied to microarrays may result in poor scaling of the summarized data which can hamper analytical interpretations. This is especially...


Bibliographic Details
Main Authors:	Barenco, Martino; Stark, Jaroslav; Brewer, Daniel; Tomescu, Daniela; Callard, Robin; Hubank, Michael
Format:	Text
Language:	English
Published:	BioMed Central 2006
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1508160/ https://www.ncbi.nlm.nih.gov/pubmed/16684345 http://dx.doi.org/10.1186/1471-2105-7-251

_version_	1782128436327743488
author	Barenco, Martino; Stark, Jaroslav; Brewer, Daniel; Tomescu, Daniela; Callard, Robin; Hubank, Michael
author_facet	Barenco, Martino; Stark, Jaroslav; Brewer, Daniel; Tomescu, Daniela; Callard, Robin; Hubank, Michael
author_sort	Barenco, Martino
collection	PubMed
description	BACKGROUND: Gene expression microarray data is notoriously subject to high signal variability. Moreover, unavoidable variation in the concentration of transcripts applied to microarrays may result in poor scaling of the summarized data which can hamper analytical interpretations. This is especially relevant in a systems biology context, where systematic biases in the signals of particular genes can have severe effects on subsequent analyses. Conventionally it would be necessary to replace the mismatched arrays, but individual time points cannot be rerun and inserted because of experimental variability. It would therefore be necessary to repeat the whole time series experiment, which is both impractical and expensive. RESULTS: We explain how scaling mismatches occur in data summarized by the popular MAS5 (GCOS; Affymetrix) algorithm, and propose a simple recursive algorithm to correct them. Its principle is to identify a set of constant genes and to use this set to rescale the microarray signals. We study the properties of the algorithm using artificially generated data and apply it to experimental data. We show that the set of constant genes it generates can be used to rescale data from other experiments, provided that the underlying system is similar to the original. We also demonstrate, using a simple example, that the method can successfully correct existing imbalances in the data. CONCLUSION: The set of constant genes obtained for a given experiment can be applied to other experiments, provided the systems studied are sufficiently similar. This type of rescaling is especially relevant in systems biology applications using microarray data.
format	Text
id	pubmed-1508160
institution	National Center for Biotechnology Information
language	English
publishDate	2006
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1508160 2006-07-18 Correction of scaling mismatches in oligonucleotide microarray data Barenco, Martino Stark, Jaroslav Brewer, Daniel Tomescu, Daniela Callard, Robin Hubank, Michael BMC Bioinformatics Methodology Article BACKGROUND: Gene expression microarray data is notoriously subject to high signal variability. Moreover, unavoidable variation in the concentration of transcripts applied to microarrays may result in poor scaling of the summarized data which can hamper analytical interpretations. This is especially relevant in a systems biology context, where systematic biases in the signals of particular genes can have severe effects on subsequent analyses. Conventionally it would be necessary to replace the mismatched arrays, but individual time points cannot be rerun and inserted because of experimental variability. It would therefore be necessary to repeat the whole time series experiment, which is both impractical and expensive. RESULTS: We explain how scaling mismatches occur in data summarized by the popular MAS5 (GCOS; Affymetrix) algorithm, and propose a simple recursive algorithm to correct them. Its principle is to identify a set of constant genes and to use this set to rescale the microarray signals. We study the properties of the algorithm using artificially generated data and apply it to experimental data. We show that the set of constant genes it generates can be used to rescale data from other experiments, provided that the underlying system is similar to the original. We also demonstrate, using a simple example, that the method can successfully correct existing imbalances in the data. CONCLUSION: The set of constant genes obtained for a given experiment can be applied to other experiments, provided the systems studied are sufficiently similar. This type of rescaling is especially relevant in systems biology applications using microarray data.
BioMed Central 2006-05-09 /pmc/articles/PMC1508160/ /pubmed/16684345 http://dx.doi.org/10.1186/1471-2105-7-251 Text en Copyright © 2006 Barenco et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( (http://creativecommons.org/licenses/by/2.0) ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle	Methodology Article Barenco, Martino Stark, Jaroslav Brewer, Daniel Tomescu, Daniela Callard, Robin Hubank, Michael Correction of scaling mismatches in oligonucleotide microarray data
title	Correction of scaling mismatches in oligonucleotide microarray data
title_full	Correction of scaling mismatches in oligonucleotide microarray data
title_fullStr	Correction of scaling mismatches in oligonucleotide microarray data
title_full_unstemmed	Correction of scaling mismatches in oligonucleotide microarray data
title_short	Correction of scaling mismatches in oligonucleotide microarray data
title_sort	correction of scaling mismatches in oligonucleotide microarray data
topic	Methodology Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1508160/ https://www.ncbi.nlm.nih.gov/pubmed/16684345 http://dx.doi.org/10.1186/1471-2105-7-251
work_keys_str_mv	AT barencomartino correctionofscalingmismatchesinoligonucleotidemicroarraydata AT starkjaroslav correctionofscalingmismatchesinoligonucleotidemicroarraydata AT brewerdaniel correctionofscalingmismatchesinoligonucleotidemicroarraydata AT tomescudaniela correctionofscalingmismatchesinoligonucleotidemicroarraydata AT callardrobin correctionofscalingmismatchesinoligonucleotidemicroarraydata AT hubankmichael correctionofscalingmismatchesinoligonucleotidemicroarraydata
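
The abstract describes the method only in outline: identify a set of constant genes, then use that set to rescale the microarray signals. The Python sketch below illustrates that general idea and is not the authors' published procedure. The function name, the `keep_fraction` parameter, the use of the coefficient of variation as the constancy criterion, and the iterative (rather than recursive) formulation are all assumptions made for illustration.

```python
import numpy as np

def rescale_by_constant_genes(signals, keep_fraction=0.2, max_iter=50):
    """Illustrative sketch only (hypothetical helper, not the paper's algorithm).

    signals: genes x arrays matrix of strictly positive summarized intensities.
    Returns the rescaled matrix, the per-array factors, and the indices of the
    genes treated as constant.
    """
    x = np.asarray(signals, dtype=float)
    n_genes, n_arrays = x.shape
    n_keep = max(1, int(round(keep_fraction * n_genes)))
    factors = np.ones(n_arrays)
    constant = None
    for _ in range(max_iter):
        scaled = x * factors
        # Assumed criterion: a low coefficient of variation across arrays
        # marks a gene as "constant" under the current scaling.
        cv = scaled.std(axis=1) / scaled.mean(axis=1)
        candidate = np.sort(np.argsort(cv, kind="stable")[:n_keep])
        if constant is not None and np.array_equal(candidate, constant):
            break  # the selected set no longer changes
        constant = candidate
        # Per-array level of the selected genes, taken from the raw data;
        # each array is scaled so that this level matches the overall mean.
        level = x[constant].mean(axis=0)
        factors = level.mean() / level
    return x * factors, factors, constant

# Toy data: six constant genes, four varying genes, four arrays
# measured with mismatched scales.
true_signals = np.array([
    [100, 100, 100, 100], [200, 200, 200, 200], [300, 300, 300, 300],
    [400, 400, 400, 400], [500, 500, 500, 500], [600, 600, 600, 600],
    [100, 200, 400, 800], [4000, 2000, 1000, 500],
    [50, 300, 50, 300], [500, 100, 500, 100],
], dtype=float)
array_scales = np.array([1.0, 2.0, 0.5, 1.0])
observed = true_signals * array_scales

rescaled, factors, constant = rescale_by_constant_genes(observed, keep_fraction=0.5)
print(factors * array_scales)  # roughly the same value for every array
```

In this toy case the recovered factors are proportional to the inverse of the applied array scales, so the constant genes come out flat across arrays after rescaling. Real MAS5 output is noisy, and the choice of `keep_fraction` and of the constancy criterion would need to be justified against the data.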


Similar Items