The effects of normalization on the correlation structure of microarray data

BACKGROUND: Stochastic dependence between gene expression levels in microarray data is of critical importance for the methods of statistical inference that resort to pooling test-statistics across genes. It is frequently assumed that dependence between genes (or tests) is sufficiently weak to justify...

Bibliographic Details
Main Authors: Qiu, Xing, Brooks, Andrew I, Klebanov, Lev, Yakovlev, Andrei
Format: Text
Language: English
Published: BioMed Central 2005
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1156869/
https://www.ncbi.nlm.nih.gov/pubmed/15904488
http://dx.doi.org/10.1186/1471-2105-6-120
_version_ 1782124322003877888
author Qiu, Xing
Brooks, Andrew I
Klebanov, Lev
Yakovlev, Andrei
collection PubMed
description BACKGROUND: Stochastic dependence between gene expression levels in microarray data is of critical importance for the methods of statistical inference that resort to pooling test-statistics across genes. It is frequently assumed that dependence between genes (or tests) is sufficiently weak to justify the proposed methods of testing for differentially expressed genes. A potential impact of between-gene correlations on the performance of such methods has yet to be explored. RESULTS: The paper presents a systematic study of correlation between the t-statistics associated with different genes. We report the effects of four different normalization methods using a large set of microarray data on childhood leukemia in addition to several sets of simulated data. Our findings help decipher the correlation structure of microarray data before and after the application of normalization procedures. CONCLUSION: A long-range correlation in microarray data manifests itself in thousands of genes that are heavily correlated with a given gene in terms of the associated t-statistics. By using normalization methods it is possible to significantly reduce correlation between the t-statistics computed for different genes. Normalization procedures affect both the true correlation, stemming from gene interactions, and the spurious correlation induced by random noise. When analyzing real-world biological data sets, normalization procedures are unable to completely remove correlation between the test statistics. The long-range correlation structure also persists in normalized data.
format Text
id pubmed-1156869
institution National Center for Biotechnology Information
language English
publishDate 2005
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-11568692005-06-22 The effects of normalization on the correlation structure of microarray data Qiu, Xing Brooks, Andrew I Klebanov, Lev Yakovlev, Andrei BMC Bioinformatics Methodology Article BACKGROUND: Stochastic dependence between gene expression levels in microarray data is of critical importance for the methods of statistical inference that resort to pooling test-statistics across genes. It is frequently assumed that dependence between genes (or tests) is sufficiently weak to justify the proposed methods of testing for differentially expressed genes. A potential impact of between-gene correlations on the performance of such methods has yet to be explored. RESULTS: The paper presents a systematic study of correlation between the t-statistics associated with different genes. We report the effects of four different normalization methods using a large set of microarray data on childhood leukemia in addition to several sets of simulated data. Our findings help decipher the correlation structure of microarray data before and after the application of normalization procedures. CONCLUSION: A long-range correlation in microarray data manifests itself in thousands of genes that are heavily correlated with a given gene in terms of the associated t-statistics. By using normalization methods it is possible to significantly reduce correlation between the t-statistics computed for different genes. Normalization procedures affect both the true correlation, stemming from gene interactions, and the spurious correlation induced by random noise. When analyzing real-world biological data sets, normalization procedures are unable to completely remove correlation between the test statistics. The long-range correlation structure also persists in normalized data. BioMed Central 2005-05-16 /pmc/articles/PMC1156869/ /pubmed/15904488 http://dx.doi.org/10.1186/1471-2105-6-120 Text en Copyright © 2005 Qiu et al; licensee BioMed Central Ltd.
http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title The effects of normalization on the correlation structure of microarray data
topic Methodology Article
work_keys_str_mv AT qiuxing theeffectsofnormalizationonthecorrelationstructureofmicroarraydata
AT brooksandrewi theeffectsofnormalizationonthecorrelationstructureofmicroarraydata
AT klebanovlev theeffectsofnormalizationonthecorrelationstructureofmicroarraydata
AT yakovlevandrei theeffectsofnormalizationonthecorrelationstructureofmicroarraydata
AT qiuxing effectsofnormalizationonthecorrelationstructureofmicroarraydata
AT brooksandrewi effectsofnormalizationonthecorrelationstructureofmicroarraydata
AT klebanovlev effectsofnormalizationonthecorrelationstructureofmicroarraydata
AT yakovlevandrei effectsofnormalizationonthecorrelationstructureofmicroarraydata