
Can Zipf's law be adapted to normalize microarrays?

BACKGROUND: Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for varying organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an ex...


Bibliographic Details
Main authors: Lu, Tim, Costello, Christine M, Croucher, Peter JP, Häsler, Robert, Deuschl, Günther, Schreiber, Stefan
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC555536/
https://www.ncbi.nlm.nih.gov/pubmed/15727680
http://dx.doi.org/10.1186/1471-2105-6-37
_version_ 1782122531940990976
author Lu, Tim
Costello, Christine M
Croucher, Peter JP
Häsler, Robert
Deuschl, Günther
Schreiber, Stefan
collection PubMed
description BACKGROUND: Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for various organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e., they obey Zipf's law). Based on the observation that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law and to examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented. RESULTS: Using pairwise comparisons on MA plots (log ratio vs. log intensity), we compared this novel method to previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. Results indicated that, for single channel microarrays, the quantile method was superior with regard to eliminating intensity-dependent effects (banana curves), but Zipf's law normalization does minimize this effect by rotating the data distribution such that the maximal number of data points lies on the zero of the log ratio axis. For two channel boutique microarrays, the Zipf's law normalizations performed as well as, or better than, existing techniques. CONCLUSION: Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).
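The description above refers to two quantities that are easy to illustrate: the power-law exponent of a rank-ordered intensity distribution, and the M (log ratio) and A (mean log intensity) values plotted in an MA plot. The following is a minimal sketch under stated assumptions, not the authors' normalization procedure, whose details are not given in this record. The function names `zipf_exponent` and `ma_values` are hypothetical, the exponent is estimated by a plain least-squares fit in log-log space, and A is taken in its usual form as the mean of the two log2 intensities.

```python
import numpy as np


def zipf_exponent(intensities):
    """Estimate the power-law exponent of a rank-intensity distribution.

    Sorts the intensities in descending order, assigns ranks 1..n, and
    returns the least-squares slope of log(intensity) against log(rank).
    A slope close to -1 corresponds to Zipf's law.
    (Illustrative helper; not taken from the paper.)
    """
    x = np.sort(np.asarray(intensities, dtype=float))[::-1]
    x = x[x > 0]  # the logarithm is undefined for non-positive values
    ranks = np.arange(1, x.size + 1)
    slope, _intercept = np.polyfit(np.log(ranks), np.log(x), 1)
    return slope


def ma_values(x, y):
    """Return (M, A) for two positive intensity vectors.

    M is the log2 ratio and A is the mean log2 intensity, i.e. the
    coordinates of an MA plot.
    """
    lx = np.log2(np.asarray(x, dtype=float))
    ly = np.log2(np.asarray(y, dtype=float))
    return lx - ly, 0.5 * (lx + ly)


if __name__ == "__main__":
    # Synthetic intensities that follow intensity = 1000 / rank exactly.
    synthetic = 1000.0 / np.arange(1, 201)
    print("estimated exponent:", zipf_exponent(synthetic))

    m, a = ma_values([8.0, 4.0], [2.0, 4.0])
    print("M:", m, "A:", a)
```

On real arrays the fitted slope would only approximate -1, and the fit is sensitive to the low-intensity tail, so in practice the range of ranks used for the fit would need to be chosen with care.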
format Text
id pubmed-555536
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-555536 2005-03-25 Can Zipf's law be adapted to normalize microarrays? Lu, Tim Costello, Christine M Croucher, Peter JP Häsler, Robert Deuschl, Günther Schreiber, Stefan BMC Bioinformatics Methodology Article
BioMed Central 2005-02-23 /pmc/articles/PMC555536/ /pubmed/15727680 http://dx.doi.org/10.1186/1471-2105-6-37 Text en Copyright © 2005 Lu et al; licensee BioMed Central Ltd.
title Can Zipf's law be adapted to normalize microarrays?
topic Methodology Article