Empirical comparison of cross-platform normalization methods for gene expression data

BACKGROUND: Simultaneous measurement of gene expression on a genomic scale can be accomplished using microarray technology or by sequencing based methods. Researchers who perform high throughput gene expression assays often deposit their data in public databases, but heterogeneity of measurement pla...

Bibliographic Details
Main authors: Rudy, Jason, Valafar, Faramarz
Format: Online Article Text
Language: English
Published: BioMed Central 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3314675/
https://www.ncbi.nlm.nih.gov/pubmed/22151536
http://dx.doi.org/10.1186/1471-2105-12-467
author Rudy, Jason
Valafar, Faramarz
collection PubMed
description BACKGROUND: Simultaneous measurement of gene expression on a genomic scale can be accomplished using microarray technology or by sequencing based methods. Researchers who perform high throughput gene expression assays often deposit their data in public databases, but heterogeneity of measurement platforms leads to challenges for the combination and comparison of data sets. Researchers wishing to perform cross platform normalization face two major obstacles. First, a choice must be made about which method or methods to employ. Nine are currently available, and no rigorous comparison exists. Second, software for the selected method must be obtained and incorporated into a data analysis workflow. RESULTS: Using two publicly available cross-platform testing data sets, cross-platform normalization methods are compared based on inter-platform concordance and on the consistency of gene lists obtained with transformed data. Scatter and ROC-like plots are produced and new statistics based on those plots are introduced to measure the effectiveness of each method. Bootstrapping is employed to obtain distributions for those statistics. The consistency of platform effects across studies is explored theoretically and with respect to the testing data sets. CONCLUSIONS: Our comparisons indicate that four methods, DWD, EB, GQ, and XPN, are generally effective, while the remaining methods do not adequately correct for platform effects. Of the four successful methods, XPN generally shows the highest inter-platform concordance when treatment groups are equally sized, while DWD is most robust to differently sized treatment groups and consistently shows the smallest loss in gene detection. We provide an R package, CONOR, capable of performing the nine cross-platform normalization methods considered. The package can be downloaded at http://alborz.sdsu.edu/conor and is available from CRAN.
format Online
Article
Text
id pubmed-3314675
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-3314675 2012-04-02 Empirical comparison of cross-platform normalization methods for gene expression data Rudy, Jason Valafar, Faramarz BMC Bioinformatics Research Article BioMed Central 2011-12-07 /pmc/articles/PMC3314675/ /pubmed/22151536 http://dx.doi.org/10.1186/1471-2105-12-467 Text en Copyright ©2011 Rudy and Valafar; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Empirical comparison of cross-platform normalization methods for gene expression data
topic Research Article