
Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method

BACKGROUND: Low-level processing and normalization of microarray data are most important steps in microarray analysis, which have profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study th...


Bibliographic Details
Main Authors: Bengtsson, Henrik; Hössjer, Ola
Format: Text
Language: English
Published: BioMed Central 2006
Subjects: Methodology Article
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1534066/
https://www.ncbi.nlm.nih.gov/pubmed/16509971
http://dx.doi.org/10.1186/1471-2105-7-100
author Bengtsson, Henrik
Hössjer, Ola
collection PubMed
description BACKGROUND: Low-level processing and normalization of microarray data are most important steps in microarray analysis, which have profound impact on downstream analysis. Multiple methods have been suggested to date, but it is not clear which is the best. It is therefore important to further study the different normalization methods in detail and the nature of microarray data in general.
RESULTS: A methodological study of affine models for gene expression data is carried out. Focus is on two-channel comparative studies, but the findings generalize also to single- and multi-channel data. The discussion applies to spotted as well as in-situ synthesized microarray data. Existing normalization methods such as curve-fit ("lowess") normalization, parallel and perpendicular translation normalization, and quantile normalization, but also dye-swap normalization are revisited in the light of the affine model and their strengths and weaknesses are investigated in this context. As a direct result from this study, we propose a robust non-parametric multi-dimensional affine normalization method, which can be applied to any number of microarrays with any number of channels either individually or all at once. A high-quality cDNA microarray data set with spike-in controls is used to demonstrate the power of the affine model and the proposed normalization method.
CONCLUSION: We find that an affine model can explain non-linear intensity-dependent systematic effects in observed log-ratios. Affine normalization removes such artifacts for non-differentially expressed genes and assures that symmetry between negative and positive log-ratios is obtained, which is fundamental when identifying differentially expressed genes. In addition, affine normalization makes the empirical distributions in different channels more equal, which is the purpose of quantile normalization, and may also explain why dye-swap normalization works or fails. All methods are made available in the aroma package, which is a platform-independent package for R.
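The claim that an affine model can explain non-linear, intensity-dependent effects in log-ratios can be illustrated numerically. The sketch below is not the authors' method and does not use the aroma package. It assumes that each channel's observed signal is an offset plus a scale times the true signal, and every name and parameter value in it is hypothetical. For non-differentially expressed genes (equal true signal in both channels), unequal channel offsets produce a log-ratio that is large at low intensity and shrinks toward zero at high intensity. Subtracting the offsets, which are taken as known here rather than estimated robustly as the record describes, restores a log-ratio of zero.

```python
import math

# Hypothetical channel parameters (assumptions, not taken from the paper):
# observed = offset + scale * true_signal
OFFSET_R, SCALE_R = 200.0, 1.0   # "red" channel
OFFSET_G, SCALE_G = 20.0, 1.0    # "green" channel


def observed(true_signal, offset, scale):
    """Affine measurement model for one channel."""
    return offset + scale * true_signal


def backtransform(signal, offset, scale):
    """Invert the affine model, assuming offset and scale are known."""
    return (signal - offset) / scale


def log_ratio(red, green):
    """Log-ratio M = log2(red / green)."""
    return math.log2(red / green)


# Non-differentially expressed genes: same true signal in both channels,
# spanning low to high intensities.
true_signals = [2.0 ** k for k in range(4, 17, 2)]

# Log-ratios of the raw (affinely distorted) signals.
m_raw = [
    log_ratio(observed(x, OFFSET_R, SCALE_R), observed(x, OFFSET_G, SCALE_G))
    for x in true_signals
]

# Log-ratios after removing the affine distortion from each channel.
m_corrected = [
    log_ratio(
        backtransform(observed(x, OFFSET_R, SCALE_R), OFFSET_R, SCALE_R),
        backtransform(observed(x, OFFSET_G, SCALE_G), OFFSET_G, SCALE_G),
    )
    for x in true_signals
]

for x, raw, fixed in zip(true_signals, m_raw, m_corrected):
    print(f"true signal {x:8.0f}  raw M {raw:6.3f}  corrected M {fixed:6.3f}")
```

In this toy setting the raw log-ratio depends on intensity even though no gene is differentially expressed, which is the kind of artifact that curve-fit normalization would otherwise treat as a non-linear effect. A real normalization must estimate the offsets and scales from the data, which this sketch does not attempt.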
format Text
id pubmed-1534066
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1534066 2006-08-10 Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method Bengtsson, Henrik Hössjer, Ola BMC Bioinformatics Methodology Article BioMed Central 2006-03-01 /pmc/articles/PMC1534066/ /pubmed/16509971 http://dx.doi.org/10.1186/1471-2105-7-100 Text en Copyright © 2006 Bengtsson and Hössjer; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Methodological study of affine transformations of gene expression data with proposed robust non-parametric multi-dimensional normalization method
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1534066/
https://www.ncbi.nlm.nih.gov/pubmed/16509971
http://dx.doi.org/10.1186/1471-2105-7-100