Comprehensive analysis of correlation coefficients estimated from pooling heterogeneous microarray data

BACKGROUND: The synthesis of information across microarray studies has been performed by combining statistical results of individual studies (as in a mosaic), or by combining data from multiple studies into a large pool to be analyzed as a single data set (as in a melting pot of data). Specific issu...

Bibliographic Details
Main authors: Almeida-de-Macedo, Márcia M; Ransom, Nick; Feng, Yaping; Hurst, Jonathan; Wurtele, Eve Syrkin
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765419/
https://www.ncbi.nlm.nih.gov/pubmed/23822712
http://dx.doi.org/10.1186/1471-2105-14-214
author Almeida-de-Macedo, Márcia M
Ransom, Nick
Feng, Yaping
Hurst, Jonathan
Wurtele, Eve Syrkin
collection PubMed
description BACKGROUND: The synthesis of information across microarray studies has been performed by combining statistical results of individual studies (as in a mosaic), or by combining data from multiple studies into a large pool to be analyzed as a single data set (as in a melting pot of data). Specific issues relating to data heterogeneity across microarray studies, such as differences within and between labs or differences among experimental conditions, could lead to equivocal results in a melting pot approach. RESULTS: We applied statistical theory to determine the specific effect of different means and heteroskedasticity across 19 groups of microarray data on the sign and magnitude of gene-to-gene Pearson correlation coefficients obtained from the pool of 19 groups. We quantified the biases of the pooled coefficients and compared them to the biases of correlations estimated by an effect-size model. Mean differences across the 19 groups were the main factor determining the magnitude and sign of the pooled coefficients, which showed largest values of bias as they approached ±1. Only heteroskedasticity across the pool of 19 groups resulted in less efficient estimations of correlations than did a classical meta-analysis approach of combining correlation coefficients. These results were corroborated by simulation studies involving either mean differences or heteroskedasticity across a pool of N > 2 groups. CONCLUSIONS: The combination of statistical results is best suited for synthesizing the correlation between expression profiles of a gene pair across several microarray studies.
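The contrast the abstract draws between pooling raw data and combining per-group statistics can be illustrated with a small simulation. The sketch below is not the authors' code or their effect-size model: it assumes bivariate normal data, a shared negative within-group correlation, group-specific mean shifts applied to both genes, and a simple fixed-effect Fisher z combination as a stand-in for a classical meta-analysis. All function names and parameter values are hypothetical.

```python
import numpy as np


def pooled_correlation(groups):
    """Pearson r computed after stacking all groups into one data set."""
    stacked = np.vstack(groups)
    return float(np.corrcoef(stacked[:, 0], stacked[:, 1])[0, 1])


def combined_correlation(groups):
    """Combine per-group Pearson r values via Fisher's z, weighted by n - 3."""
    z_values, weights = [], []
    for g in groups:
        r = np.corrcoef(g[:, 0], g[:, 1])[0, 1]
        z_values.append(np.arctanh(r))   # Fisher z-transform of the group r
        weights.append(len(g) - 3)       # inverse of the variance of z
    return float(np.tanh(np.average(z_values, weights=weights)))


def simulate_groups(n_groups=19, n_per_group=30, rho=-0.5,
                    shift_sd=3.0, seed=0):
    """Simulate groups sharing a within-group correlation rho.

    Each group gets its own mean, and the same shift is applied to both
    genes (an assumption made here to mimic lab or condition effects).
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    groups = []
    for _ in range(n_groups):
        shift = rng.normal(0.0, shift_sd)
        groups.append(
            rng.multivariate_normal([shift, shift], cov, size=n_per_group)
        )
    return groups


groups = simulate_groups()
r_pooled = pooled_correlation(groups)
r_combined = combined_correlation(groups)
print(f"pooled r:   {r_pooled:+.2f}")
print(f"combined r: {r_combined:+.2f}")
```

Under these assumptions the pooled coefficient comes out positive even though every group was generated with a negative correlation, because the between-group mean differences dominate the pooled covariance; the combined estimate stays close to the within-group value.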
format Online
Article
Text
id pubmed-3765419
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3765419 2013-09-10 Comprehensive analysis of correlation coefficients estimated from pooling heterogeneous microarray data Almeida-de-Macedo, Márcia M Ransom, Nick Feng, Yaping Hurst, Jonathan Wurtele, Eve Syrkin BMC Bioinformatics Research Article
BioMed Central 2013-07-04 /pmc/articles/PMC3765419/ /pubmed/23822712 http://dx.doi.org/10.1186/1471-2105-14-214 Text en Copyright © 2013 Almeida-de-Macedo et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comprehensive analysis of correlation coefficients estimated from pooling heterogeneous microarray data
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3765419/
https://www.ncbi.nlm.nih.gov/pubmed/23822712
http://dx.doi.org/10.1186/1471-2105-14-214