Optimized LOWESS normalization parameter selection for DNA microarray data

Bibliographic Details
Main authors: Berger, John A, Hautaniemi, Sampsa, Järvinen, Anna-Kaarina, Edgren, Henrik, Mitra, Sanjit K, Astola, Jaakko
Format: Text
Language: English
Published: BioMed Central 2004
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC539276/
https://www.ncbi.nlm.nih.gov/pubmed/15588297
http://dx.doi.org/10.1186/1471-2105-5-194
author Berger, John A
Hautaniemi, Sampsa
Järvinen, Anna-Kaarina
Edgren, Henrik
Mitra, Sanjit K
Astola, Jaakko
collection PubMed
description BACKGROUND: Microarray data normalization is an important step for obtaining data that are reliable and usable for subsequent analysis. One of the most commonly used normalization techniques is the locally weighted scatterplot smoothing (LOWESS) algorithm. However, an often overlooked concern with the LOWESS normalization strategy is the choice of appropriate parameters. Parameters are usually chosen arbitrarily, which may reduce the efficiency of the normalization and result in non-optimally normalized data. Thus, there is a need to explore LOWESS parameter selection in greater detail.
RESULTS AND DISCUSSION: In this work, we discuss how to choose parameters for the LOWESS method. Moreover, we present an optimization approach for obtaining the fraction of data points used in the local regression and analyze results for local print-tip normalization. The optimization procedure determines the bandwidth parameter for the local regression by minimizing a cost function that represents the mean-squared difference between the LOWESS estimates and the normalization reference level. We demonstrate the utility of the systematic parameter selection using two publicly available data sets. The first data set consists of three self versus self hybridizations, which allow for a quantitative study of the optimization method. The second data set contains a collection of DNA microarray data from a breast cancer study using four breast cancer cell lines. Our results show that different parameter choices for the bandwidth window yield dramatically different calibration results in both studies.
CONCLUSIONS: Results derived from the self versus self experiment indicate that the proposed optimization approach is a plausible solution for estimating the LOWESS parameters, while results from the breast cancer experiment show that the optimization procedure is readily applicable to real-life microarray data normalization. In summary, the systematic approach to obtaining critical parameters in the LOWESS technique is likely to produce data that optimally meet the assumptions made in the data preprocessing step, and thereby makes studies using the LOWESS method unambiguous and easier to repeat.
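The bandwidth selection described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the record does not give the exact cost function, so the version below is one possible reading, in which the data are normalized with a candidate fraction, the normalized values are smoothed again with the same fraction, and the mean-squared distance of that curve from a reference level of zero is minimized over a grid. The names `lowess`, `cost`, `select_fraction`, the variables `a` (log intensity) and `m` (log ratio), the candidate grid, and the synthetic data are all assumptions made for illustration. The smoother is a plain local linear fit with tricube weights and no robustness iterations.

```python
import math
import random


def lowess(x, y, frac):
    """Local linear fit with tricube weights (no robustness iterations)."""
    n = len(x)
    k = max(2, min(n, int(math.ceil(frac * n))))  # points per local window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        sw = swx = swy = swxx = swxy = 0.0
        for xi, yi in zip(x, y):
            u = abs(xi - x0) / h
            if u >= 1.0:
                continue  # outside the window, weight is zero
            w = (1.0 - u ** 3) ** 3  # tricube weight
            sw += w
            swx += w * xi
            swy += w * yi
            swxx += w * xi * xi
            swxy += w * xi * yi
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)  # degenerate window: weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            intercept = (swy - slope * swx) / sw
            fitted.append(intercept + slope * x0)
    return fitted


def cost(a, m, frac, reference=0.0):
    """Assumed cost: mean-squared gap between the smoothed normalized
    data and the reference level (zero by default)."""
    fit = lowess(a, m, frac)
    normalized = [mi - fi for mi, fi in zip(m, fit)]
    trend = lowess(a, normalized, frac)
    return sum((t - reference) ** 2 for t in trend) / len(trend)


def select_fraction(a, m, candidates):
    """Grid search: return the candidate fraction with the lowest cost."""
    return min(candidates, key=lambda f: cost(a, m, f))


# Synthetic, hypothetical data: a curved intensity-dependent bias plus noise.
random.seed(0)
a = [random.uniform(6.0, 16.0) for _ in range(200)]
m = [0.05 * (ai - 11.0) ** 2 - 0.3 + random.gauss(0.0, 0.2) for ai in a]
candidates = [0.2, 0.4, 0.6, 0.8, 1.0]
best = select_fraction(a, m, candidates)
print("selected fraction:", best)
```

A grid search is used here only because it is the simplest way to show the idea; the abstract does not say which optimizer the authors used. For print-tip normalization, the same selection would presumably be run separately on the spots of each print-tip group.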
format Text
id pubmed-539276
institution National Center for Biotechnology Information
language English
publishDate 2004
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-539276 2004-12-26 BMC Bioinformatics Methodology Article BioMed Central 2004-12-09 /pmc/articles/PMC539276/ /pubmed/15588297 http://dx.doi.org/10.1186/1471-2105-5-194 Text en Copyright © 2004 Berger et al; licensee BioMed Central Ltd.
title Optimized LOWESS normalization parameter selection for DNA microarray data
topic Methodology Article