Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: do they matter for correlation analysis?

BACKGROUND. This work was undertaken in response to a recently published paper by Okoniewski and Miller (BMC Bioinformatics 2006, 7: Article 276). The authors of that paper came to the conclusion that the process of multiple targeting in short oligonucleotide microarrays induces spurious correlation...

Bibliographic Details
Main Authors: Klebanov, Lev; Chen, Linlin; Yakovlev, Andrei
Format: Text
Language: English
Published: BioMed Central 2007
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2211459/
https://www.ncbi.nlm.nih.gov/pubmed/17988401
http://dx.doi.org/10.1186/1745-6150-2-28
author Klebanov, Lev
Chen, Linlin
Yakovlev, Andrei
collection PubMed
description BACKGROUND. This work was undertaken in response to a recently published paper by Okoniewski and Miller (BMC Bioinformatics 2006, 7: Article 276). The authors of that paper concluded that multiple targeting in short oligonucleotide microarrays induces spurious correlations and that this effect may degrade inference on correlation coefficients. The design of their study and supporting simulations cast serious doubt upon the validity of this conclusion. The work by Okoniewski and Miller drove us to revisit the issue by means of experimentation with biological data and probabilistic modeling of cross-hybridization effects.
RESULTS. We have identified two serious flaws in the study by Okoniewski and Miller: (1) the data used in their paper are not amenable to correlation analysis; (2) the proposed simulation model is inadequate for studying the effects of cross-hybridization. Using two other data sets, we have shown that removing multiply targeted probe sets does not lead to a shift in the histogram of sample correlation coefficients towards smaller values. A more realistic approach to mathematical modeling of cross-hybridization demonstrates that this process is far more complex than the simplistic model considered by the authors. A diversity of correlation effects (such as the induction of positive or negative correlations) caused by cross-hybridization can be expected in theory, but there are natural limitations on the ability to provide quantitative insights into such effects because they are not directly observable.
CONCLUSION. The proposed stochastic model is instrumental in studying general regularities in hybridization interaction between probe sets in microarray data. As the problem stands now, there is no compelling reason to believe that multiple targeting causes a large-scale effect on the correlation structure of Affymetrix gene expression data. Our analysis suggests that the observed long-range correlations in microarray data are of a biological nature rather than a technological flaw.
REVIEWERS. The paper was reviewed by I. K. Jordan, D. P. Gaile (nominated by E. Koonin), and W. Huber (nominated by S. Dudoit).
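The comparison described under RESULTS (the histogram of sample correlation coefficients with and without multiply targeted probe sets) can be sketched roughly as follows. This is only an illustration on toy data: the simulated expression values, the shared-factor strength of 0.7, and the `multiply_targeted` flags are all assumptions made for the example, not the authors' data sets, their annotation of multiply targeted probe sets, or their stochastic model.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def pairwise_correlations(rows):
    """Correlation coefficient for every unordered pair of probe sets."""
    return [pearson(a, b) for a, b in combinations(rows, 2)]


def histogram(values, bins=20):
    """Relative frequencies of values over equal-width bins on [-1, 1]."""
    counts = [0] * bins
    for v in values:
        index = min(int((v + 1.0) / 2.0 * bins), bins - 1)
        counts[index] += 1
    return [c / len(values) for c in counts]


# Toy data (assumption): each probe set is a shared factor plus independent noise,
# which produces positive correlation between all pairs of probe sets.
random.seed(0)
n_probe_sets, n_samples = 60, 30
shared = [random.gauss(0.0, 1.0) for _ in range(n_samples)]
expression = [
    [0.7 * s + random.gauss(0.0, 1.0) for s in shared]
    for _ in range(n_probe_sets)
]

# Hypothetical flags: every fifth probe set is treated as multiply targeted.
multiply_targeted = [i % 5 == 0 for i in range(n_probe_sets)]
kept = [row for row, flag in zip(expression, multiply_targeted) if not flag]

all_r = pairwise_correlations(expression)
kept_r = pairwise_correlations(kept)
hist_all = histogram(all_r)
hist_kept = histogram(kept_r)

for lo_bin in range(0, 20, 4):
    left = -1.0 + lo_bin * 0.1
    print(f"bin from {left:+.1f}: all={hist_all[lo_bin]:.3f} kept={hist_kept[lo_bin]:.3f}")
```

With real data, `expression` would hold normalized probe-set intensities across arrays and the flags would come from a probe annotation; a shift of `hist_kept` towards smaller values relative to `hist_all` is the effect the abstract reports not observing.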
format Text
id pubmed-2211459
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2211459 2008-01-23 Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: do they matter for correlation analysis? Klebanov, Lev Chen, Linlin Yakovlev, Andrei Biol Direct Research BioMed Central 2007-11-07 /pmc/articles/PMC2211459/ /pubmed/17988401 http://dx.doi.org/10.1186/1745-6150-2-28 Text en Copyright © 2007 Klebanov et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Revisiting adverse effects of cross-hybridization in Affymetrix gene expression data: do they matter for correlation analysis?
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2211459/
https://www.ncbi.nlm.nih.gov/pubmed/17988401
http://dx.doi.org/10.1186/1745-6150-2-28