caCORRECT2: Improving the accuracy and reliability of microarray data in the presence of artifacts

BACKGROUND: In previous work, we reported the development of caCORRECT, a novel microarray quality control system built to identify and correct spatial artifacts commonly found on Affymetrix arrays. We have made recent improvements to caCORRECT, including the development of a model-based data-replac...

Bibliographic Details
Main Authors: Moffitt, Richard A; Yin-Goen, Qiqin; Stokes, Todd H; Parry, R Mitchell; Torrance, James H; Phan, John H; Young, Andrew N; Wang, May D
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3230913/
https://www.ncbi.nlm.nih.gov/pubmed/21957981
http://dx.doi.org/10.1186/1471-2105-12-383
author Moffitt, Richard A
Yin-Goen, Qiqin
Stokes, Todd H
Parry, R Mitchell
Torrance, James H
Phan, John H
Young, Andrew N
Wang, May D
author_sort Moffitt, Richard A
collection PubMed
description BACKGROUND: In previous work, we reported the development of caCORRECT, a novel microarray quality control system built to identify and correct spatial artifacts commonly found on Affymetrix arrays. We have since improved caCORRECT by developing a model-based data-replacement strategy and integrating it with typical microarray workflows via caCORRECT's web portal and caBIG grid services. In this report, we demonstrate that caCORRECT improves the reproducibility and reliability of experimental results across several common Affymetrix microarray platforms. caCORRECT represents an advance over state-of-the-art quality control methods such as Harshlighting, and improves gene expression calculation techniques such as PLIER, RMA and MAS5.0, because it incorporates spatial information into outlier detection and outlier information into probe normalization. The ability of caCORRECT to recover accurate gene expression values from low-quality probe intensity data is assessed using a combination of real and synthetic artifacts, with PCR follow-up confirmation, and the affycomp spike-in data. The caCORRECT tool can be accessed at http://cacorrect.bme.gatech.edu.
RESULTS: We demonstrate that (1) caCORRECT's artifact-aware normalization avoids the undesirable global data warping that happens when any damaged chips are processed without caCORRECT; (2) when used upstream of RMA, PLIER, or MAS5.0, the data imputation of caCORRECT generally improves the accuracy of microarray gene expression in the presence of artifacts more than Harshlighting or no quality control does; (3) biomarkers selected from artifactual microarray data that have undergone the quality control procedures of caCORRECT are more likely to be reliable, as shown by both spike-in and PCR validation experiments. Finally, we present a case study of the use of caCORRECT to reliably identify biomarkers for renal cell carcinoma, yielding two diagnostic biomarkers with potential clinical utility, PRKAB1 and NNMT.
CONCLUSIONS: caCORRECT is shown to improve the accuracy of gene expression and the reproducibility of experimental results in clinical application. This study suggests that caCORRECT will be useful for cleaning up possible artifacts in new as well as archived microarray data.
format Online
Article
Text
id pubmed-3230913
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3230913 2011-12-07 caCORRECT2: Improving the accuracy and reliability of microarray data in the presence of artifacts. Moffitt, Richard A; Yin-Goen, Qiqin; Stokes, Todd H; Parry, R Mitchell; Torrance, James H; Phan, John H; Young, Andrew N; Wang, May D. BMC Bioinformatics. Research Article. BioMed Central 2011-09-29 /pmc/articles/PMC3230913/ /pubmed/21957981 http://dx.doi.org/10.1186/1471-2105-12-383 Text en Copyright ©2011 Moffitt et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title caCORRECT2: Improving the accuracy and reliability of microarray data in the presence of artifacts
topic Research Article