Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data

BACKGROUND: Many researchers are concerned with the comparability and reliability of microarray gene expression data. Recent completion of the MicroArray Quality Control (MAQC) project provides a unique opportunity to assess reproducibility across multiple sites and the comparability across multiple...

Bibliographic Details
Main Authors: Chen, James J; Hsueh, Huey-Miin; Delongchamp, Robert R; Lin, Chien-Ju; Tsai, Chen-An
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2204045/
https://www.ncbi.nlm.nih.gov/pubmed/17961233
http://dx.doi.org/10.1186/1471-2105-8-412
_version_ 1782148421397774336
author Chen, James J
Hsueh, Huey-Miin
Delongchamp, Robert R
Lin, Chien-Ju
Tsai, Chen-An
author_facet Chen, James J
Hsueh, Huey-Miin
Delongchamp, Robert R
Lin, Chien-Ju
Tsai, Chen-An
author_sort Chen, James J
collection PubMed
description BACKGROUND: Many researchers are concerned with the comparability and reliability of microarray gene expression data. The recent completion of the MicroArray Quality Control (MAQC) project provides a unique opportunity to assess reproducibility across multiple sites and comparability across multiple platforms. The MAQC analysis presented in support of the conclusion of inter- and intra-platform comparability/reproducibility of microarray gene expression measurements is inadequate. We evaluate the reproducibility/comparability of the MAQC data for 12901 common genes in four titration samples generated from five high-density one-color microarray platforms and the TaqMan technology. We discuss some of the problems with the use of the correlation coefficient as a metric to evaluate inter- and intra-platform reproducibility and the percent of overlapping genes (POG) as a measure for evaluating a gene selection procedure by MAQC. RESULTS: A total of 293 arrays were used in the intra- and inter-platform analysis. A hierarchical cluster analysis shows distinct differences in the measured intensities among the five platforms. A number of genes show a small fold-change in one platform and a large fold-change in another platform, even though the correlations between platforms are high. An analysis of variance shows that thirty percent of the gene expressions of the samples have inconsistent patterns across the five platforms. We illustrate that POG does not reflect the accuracy of a selected gene list. A non-overlapping gene can be truly differentially expressed with a stringent cutoff, and an overlapping gene can be non-differentially expressed with a non-stringent cutoff. In addition, POG is an unusable selection criterion. POG can increase or decrease irregularly as the cutoff changes; there is no criterion to determine a cutoff so that POG is optimized.
CONCLUSION: Using various statistical methods, we demonstrate that there are differences in the intensities measured by different platforms and by different sites within a platform. Within each platform, the patterns of expression are generally consistent, but there is site-by-site variability. Evaluation of data analysis methods for use in regulatory decisions should take the case of no treatment effect into consideration: when there is no treatment effect, "a fold-change cutoff with a non-stringent p-value cutoff" could result in 100% false positive error selection.
format Text
id pubmed-2204045
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2204045 2008-01-29 Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data Chen, James J Hsueh, Huey-Miin Delongchamp, Robert R Lin, Chien-Ju Tsai, Chen-An BMC Bioinformatics Correspondence BACKGROUND: Many researchers are concerned with the comparability and reliability of microarray gene expression data. The recent completion of the MicroArray Quality Control (MAQC) project provides a unique opportunity to assess reproducibility across multiple sites and comparability across multiple platforms. The MAQC analysis presented in support of the conclusion of inter- and intra-platform comparability/reproducibility of microarray gene expression measurements is inadequate. We evaluate the reproducibility/comparability of the MAQC data for 12901 common genes in four titration samples generated from five high-density one-color microarray platforms and the TaqMan technology. We discuss some of the problems with the use of the correlation coefficient as a metric to evaluate inter- and intra-platform reproducibility and the percent of overlapping genes (POG) as a measure for evaluating a gene selection procedure by MAQC. RESULTS: A total of 293 arrays were used in the intra- and inter-platform analysis. A hierarchical cluster analysis shows distinct differences in the measured intensities among the five platforms. A number of genes show a small fold-change in one platform and a large fold-change in another platform, even though the correlations between platforms are high. An analysis of variance shows that thirty percent of the gene expressions of the samples have inconsistent patterns across the five platforms. We illustrate that POG does not reflect the accuracy of a selected gene list. A non-overlapping gene can be truly differentially expressed with a stringent cutoff, and an overlapping gene can be non-differentially expressed with a non-stringent cutoff. In addition, POG is an unusable selection criterion.
POG can increase or decrease irregularly as the cutoff changes; there is no criterion to determine a cutoff so that POG is optimized. CONCLUSION: Using various statistical methods, we demonstrate that there are differences in the intensities measured by different platforms and by different sites within a platform. Within each platform, the patterns of expression are generally consistent, but there is site-by-site variability. Evaluation of data analysis methods for use in regulatory decisions should take the case of no treatment effect into consideration: when there is no treatment effect, "a fold-change cutoff with a non-stringent p-value cutoff" could result in 100% false positive error selection. BioMed Central 2007-10-25 /pmc/articles/PMC2204045/ /pubmed/17961233 http://dx.doi.org/10.1186/1471-2105-8-412 Text en Copyright © 2007 Chen et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Correspondence
Chen, James J
Hsueh, Huey-Miin
Delongchamp, Robert R
Lin, Chien-Ju
Tsai, Chen-An
Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data
title Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data
title_full Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data
title_fullStr Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data
title_full_unstemmed Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data
title_short Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data
title_sort reproducibility of microarray data: a further analysis of microarray quality control (maqc) data
topic Correspondence
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2204045/
https://www.ncbi.nlm.nih.gov/pubmed/17961233
http://dx.doi.org/10.1186/1471-2105-8-412
work_keys_str_mv AT chenjamesj reproducibilityofmicroarraydataafurtheranalysisofmicroarrayqualitycontrolmaqcdata
AT hsuehhueymiin reproducibilityofmicroarraydataafurtheranalysisofmicroarrayqualitycontrolmaqcdata
AT delongchamprobertr reproducibilityofmicroarraydataafurtheranalysisofmicroarrayqualitycontrolmaqcdata
AT linchienju reproducibilityofmicroarraydataafurtheranalysisofmicroarrayqualitycontrolmaqcdata
AT tsaichenan reproducibilityofmicroarraydataafurtheranalysisofmicroarrayqualitycontrolmaqcdata