
Direct integration of intensity-level data from Affymetrix and Illumina microarrays improves statistical power for robust reanalysis

BACKGROUND: Affymetrix GeneChips and Illumina BeadArrays are the most widely used commercial single channel gene expression microarrays. Public data repositories are an extremely valuable resource, providing array-derived gene expression measurements from many thousands of experiments. Unfortunately...


Bibliographic Details
Main Authors: Turnbull, Arran K; Kitchen, Robert R; Larionov, Alexey A; Renshaw, Lorna; Dixon, J Michael; Sims, Andrew H
Format: Online Article Text
Language: English
Published: BioMed Central 2012
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3443058/
https://www.ncbi.nlm.nih.gov/pubmed/22909195
http://dx.doi.org/10.1186/1755-8794-5-35
author Turnbull, Arran K
Kitchen, Robert R
Larionov, Alexey A
Renshaw, Lorna
Dixon, J Michael
Sims, Andrew H
author_sort Turnbull, Arran K
collection PubMed
description BACKGROUND: Affymetrix GeneChips and Illumina BeadArrays are the most widely used commercial single-channel gene expression microarrays. Public data repositories are an extremely valuable resource, providing array-derived gene expression measurements from many thousands of experiments. Unfortunately, many of these studies are underpowered, and it is desirable to improve power by combining data from more than one study; we sought to determine whether platform-specific bias precludes direct integration of probe intensity signals for combined reanalysis. RESULTS: Using Affymetrix and Illumina data from the microarray quality control project, from our own clinical samples, and from additional publicly available datasets, we evaluated several approaches to directly integrate intensity-level expression data from the two platforms. After mapping probe sequences to Ensembl genes, we demonstrate that ComBat and cross-platform normalisation (XPN) significantly outperform mean-centering and distance-weighted discrimination (DWD) in terms of minimising inter-platform variance. In particular, we observed that DWD, a popular method used in a number of previous studies, removed systematic bias at the expense of genuine biological variability, potentially removing legitimate biological differences from integrated datasets. CONCLUSION: Normalised and batch-corrected intensity-level data from Affymetrix and Illumina microarrays can be directly combined to generate biologically meaningful results with improved statistical power for robust, integrated reanalysis.
format Online
Article
Text
id pubmed-3443058
institution National Center for Biotechnology Information
language English
publishDate 2012
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3443058 2012-09-15 BMC Med Genomics Research Article
BioMed Central 2012-08-21 /pmc/articles/PMC3443058/ /pubmed/22909195 http://dx.doi.org/10.1186/1755-8794-5-35 Text en Copyright ©2012 Turnbull et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Direct integration of intensity-level data from Affymetrix and Illumina microarrays improves statistical power for robust reanalysis
topic Research Article
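The abstract names mean-centering as the simplest of the integration approaches compared. As a rough illustration only, the sketch below centers each gene within each platform so that platform-specific offsets cancel. It is not the authors' pipeline, and it is not ComBat or XPN, which the abstract reports as performing better. The function name, gene labels, and intensity values are all hypothetical, and the data are assumed to be log2 intensities already mapped to common gene identifiers.

```python
from statistics import mean


def mean_center_by_platform(expr, platform):
    """Subtract, for each gene, the per-platform mean from every sample.

    expr:     dict mapping gene -> list of log2 intensities (one per sample)
    platform: list of platform labels, one per sample, in the same order
    """
    centered = {}
    for gene, values in expr.items():
        # Mean of this gene within each platform
        means = {
            p: mean(v for v, q in zip(values, platform) if q == p)
            for p in set(platform)
        }
        # Remove the platform-specific offset from every sample
        centered[gene] = [v - means[p] for v, p in zip(values, platform)]
    return centered


# Made-up toy data: two Affymetrix samples followed by two Illumina samples
platform = ["affy", "affy", "illumina", "illumina"]
expr = {
    "GENE_A": [8.0, 10.0, 11.0, 13.0],
    "GENE_B": [5.0, 5.0, 7.0, 9.0],
}
centered = mean_center_by_platform(expr, platform)
print(centered)
```

After centering, every gene has a mean of zero within each platform. This removes a constant platform offset, but it also removes any genuine difference in average expression between the sample sets run on each platform, which is one reason model-based corrections are often preferred.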