Use of normalization methods for analysis of microarrays containing a high degree of gene effects
BACKGROUND: High-throughput microarrays are widely used to study gene expression across tissues and developmental stages. Analysis of gene expression data is challenging in these experiments due to the presence of significant percentages of differentially expressed genes (DEG) observed between tissu...
Main authors: | Ni, Terri T; Lemon, William J; Shyr, Yu; Zhong, Tao P |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2008 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2612699/ https://www.ncbi.nlm.nih.gov/pubmed/19040742 http://dx.doi.org/10.1186/1471-2105-9-505 |
_version_ | 1782163135890718720 |
---|---|
author | Ni, Terri T; Lemon, William J; Shyr, Yu; Zhong, Tao P |
collection | PubMed |
description | BACKGROUND: High-throughput microarrays are widely used to study gene expression across tissues and developmental stages. Analysis of gene expression data is challenging in these experiments because a significant percentage of genes are differentially expressed (DEG) between tissues and developmental stages. Data normalization methods that are widely used today are not designed for data with a large proportion of tissue or gene effects. RESULTS: In our current study, we describe a novel two-dimensional nonparametric normalization method for analyzing microarray data that functions well in the absence or presence of large numbers of gene effects. Rather than relying on an assumption of low variability among most genes, the method implements a unique peak selection strategy to distinguish DEG from genes that are invariant in expression, prior to nonlinear curve fitting. We compared the method under simulated and experimental conditions with five alternative nonlinear normalization approaches: quantile, lowess, robust lowess, invariant set, and cross-correlation (Xcorr). Simulations included various percentages of simulated DEG, and the experimental data are from publicly available datasets known to be difficult to analyze due to the presence of approximately 34% DEG. CONCLUSION: We have demonstrated that the new method provides considerable improvement in the accuracy of data normalization when large proportions of gene effects are present. The performance improvement is mostly attributed to its variable selection component, which is designed to separate expression-invariant genes from DEG. Adding this key component of the new method to alternative normalization approaches rescues most of the sensitivity of these methods to gene effects. The results indicate that our method may be used without prior knowledge of or assumptions about housekeeping genes to normalize microarrays that are quite different. |
format | Text |
id | pubmed-2612699 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2008 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-2612699 2009-01-12 Use of normalization methods for analysis of microarrays containing a high degree of gene effects Ni, Terri T; Lemon, William J; Shyr, Yu; Zhong, Tao P BMC Bioinformatics Methodology Article BACKGROUND: High-throughput microarrays are widely used to study gene expression across tissues and developmental stages. Analysis of gene expression data is challenging in these experiments because a significant percentage of genes are differentially expressed (DEG) between tissues and developmental stages. Data normalization methods that are widely used today are not designed for data with a large proportion of tissue or gene effects. RESULTS: In our current study, we describe a novel two-dimensional nonparametric normalization method for analyzing microarray data that functions well in the absence or presence of large numbers of gene effects. Rather than relying on an assumption of low variability among most genes, the method implements a unique peak selection strategy to distinguish DEG from genes that are invariant in expression, prior to nonlinear curve fitting. We compared the method under simulated and experimental conditions with five alternative nonlinear normalization approaches: quantile, lowess, robust lowess, invariant set, and cross-correlation (Xcorr). Simulations included various percentages of simulated DEG, and the experimental data are from publicly available datasets known to be difficult to analyze due to the presence of approximately 34% DEG. CONCLUSION: We have demonstrated that the new method provides considerable improvement in the accuracy of data normalization when large proportions of gene effects are present. The performance improvement is mostly attributed to its variable selection component, which is designed to separate expression-invariant genes from DEG. Adding this key component of the new method to alternative normalization approaches rescues most of the sensitivity of these methods to gene effects. The results indicate that our method may be used without prior knowledge of or assumptions about housekeeping genes to normalize microarrays that are quite different. BioMed Central 2008-11-28 /pmc/articles/PMC2612699/ /pubmed/19040742 http://dx.doi.org/10.1186/1471-2105-9-505 Text en Copyright © 2008 Ni et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | Use of normalization methods for analysis of microarrays containing a high degree of gene effects |
topic | Methodology Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2612699/ https://www.ncbi.nlm.nih.gov/pubmed/19040742 http://dx.doi.org/10.1186/1471-2105-9-505 |
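
The description above names quantile normalization as one of the five approaches the new method was compared against. As a rough illustration only, and not the authors' two-dimensional method, standard quantile normalization might be sketched in Python with NumPy as follows. The function name quantile_normalize and the genes-by-arrays input layout are assumptions made for this sketch; ties are not averaged, which some implementations do.

```python
import numpy as np


def quantile_normalize(x):
    """Quantile-normalize a genes x arrays matrix (hypothetical helper).

    Every column (array) is forced onto the same reference distribution:
    the mean, across arrays, of the values at each rank.
    """
    x = np.asarray(x, dtype=float)
    # Row indices that would sort each column independently.
    order = np.argsort(x, axis=0, kind="stable")
    sorted_x = np.take_along_axis(x, order, axis=0)
    # Reference distribution: mean of the k-th smallest value over all arrays.
    reference = sorted_x.mean(axis=1)
    # Put the reference value for each rank back at the gene that held that rank.
    out = np.empty_like(x)
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out


if __name__ == "__main__":
    # Toy matrix: 4 genes (rows) measured on 3 arrays (columns).
    toy = np.array([[5.0, 4.0, 3.0],
                    [2.0, 1.0, 4.0],
                    [3.0, 4.0, 6.0],
                    [4.0, 2.0, 8.0]])
    print(quantile_normalize(toy))
```

Because this procedure gives every array an identical value distribution, it implicitly assumes that most genes do not differ between arrays. That is the kind of assumption the description says is not designed for data with a large proportion of differentially expressed genes.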