
Use of normalization methods for analysis of microarrays containing a high degree of gene effects

BACKGROUND: High-throughput microarrays are widely used to study gene expression across tissues and developmental stages. Analysis of gene expression data is challenging in these experiments due to the presence of significant percentages of differentially expressed genes (DEG) observed between tissu...


Bibliographic Details
Main authors: Ni, Terri T; Lemon, William J; Shyr, Yu; Zhong, Tao P
Format: Text
Language: English
Published: BioMed Central 2008
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2612699/
https://www.ncbi.nlm.nih.gov/pubmed/19040742
http://dx.doi.org/10.1186/1471-2105-9-505
author Ni, Terri T
Lemon, William J
Shyr, Yu
Zhong, Tao P
collection PubMed
description BACKGROUND: High-throughput microarrays are widely used to study gene expression across tissues and developmental stages. Analysis of gene expression data is challenging in these experiments because a significant percentage of genes are differentially expressed (DEG) between tissues and developmental stages. Data normalization methods that are widely used today are not designed for data with a large proportion of tissue or gene effects. RESULTS: In this study, we describe a novel two-dimensional nonparametric normalization method for analyzing microarray data that functions well in the absence or presence of large numbers of gene effects. Rather than relying on an assumption of low variability among most genes, the method implements a unique peak selection strategy to distinguish DEG from genes that are invariant in expression, prior to nonlinear curve fitting. We compared the method under simulated and experimental conditions with five alternative nonlinear normalization approaches: quantile, lowess, robust lowess, invariant set, and cross-correlation (Xcorr). Simulations included various percentages of simulated DEG, and the experimental data were drawn from publicly available datasets known to be difficult to analyze due to the presence of approximately 34% DEG. CONCLUSION: We have demonstrated that the new method provides considerable improvement in the accuracy of data normalization when large proportions of gene effects are present. The performance improvement is mostly attributed to its variable selection component, which is designed to separate expression-invariant genes from DEG. Adding this key component of the new method to alternative normalization approaches mitigates most of the sensitivity of these methods to gene effects. The results indicate that our method may be used, without prior knowledge of or assumptions about housekeeping genes, to normalize microarrays that are quite different.
format Text
id pubmed-2612699
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2612699 2009-01-12 Use of normalization methods for analysis of microarrays containing a high degree of gene effects Ni, Terri T; Lemon, William J; Shyr, Yu; Zhong, Tao P BMC Bioinformatics Methodology Article BioMed Central 2008-11-28 /pmc/articles/PMC2612699/ /pubmed/19040742 http://dx.doi.org/10.1186/1471-2105-9-505 Text en Copyright © 2008 Ni et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Use of normalization methods for analysis of microarrays containing a high degree of gene effects
topic Methodology Article