Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data

BACKGROUND: A critical step in processing oligonucleotide microarray data is combining the information in multiple probes to produce a single number that best captures the expression level of an RNA transcript. Several systematic studies comparing multiple methods for array processing have used tight...

Bibliographic Details
Main authors:	Shedden, Kerby; Chen, Wei; Kuick, Rork; Ghosh, Debashis; Macdonald, James; Cho, Kathleen R; Giordano, Thomas J; Gruber, Stephen B; Fearon, Eric R; Taylor, Jeremy MG; Hanash, Samir
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Research Article
Online access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC550659/ https://www.ncbi.nlm.nih.gov/pubmed/15705192 http://dx.doi.org/10.1186/1471-2105-6-26

author	Shedden, Kerby; Chen, Wei; Kuick, Rork; Ghosh, Debashis; Macdonald, James; Cho, Kathleen R; Giordano, Thomas J; Gruber, Stephen B; Fearon, Eric R; Taylor, Jeremy MG; Hanash, Samir
author_sort	Shedden, Kerby
collection	PubMed
description	BACKGROUND: A critical step in processing oligonucleotide microarray data is combining the information in multiple probes to produce a single number that best captures the expression level of an RNA transcript. Several systematic studies comparing multiple methods for array processing have used tightly controlled calibration data sets as the basis for comparison. Here we compare performances for seven processing methods using two data sets originally collected for disease profiling studies. An emphasis is placed on understanding sensitivity for detecting differentially expressed genes in terms of two key statistical determinants: test statistic variability for non-differentially expressed genes, and test statistic size for truly differentially expressed genes. RESULTS: In the two data sets considered here, up to seven-fold variation across the processing methods was found in the number of genes detected at a given false discovery rate (FDR). The best performing methods called up to 90% of the same genes differentially expressed, had less variable test statistics under randomization, and had a greater number of large test statistics in the experimental data. Poor performance of one method was directly tied to a tendency to produce highly variable test statistic values under randomization. Based on an overall measure of performance, two of the seven methods (Dchip and a trimmed mean approach) are superior in the two data sets considered here. Two other methods (MAS5 and GCRMA-EB) are inferior, while results for the other three methods are mixed. CONCLUSIONS: Choice of processing method has a major impact on differential expression analysis of microarray data. Previously reported performance analyses using tightly controlled calibration data sets are not highly consistent with results reported here using data from human tissue samples. Performance of array processing methods in disease profiling and other realistic biological studies should be given greater consideration when comparing Affymetrix processing methods.
format	Text
id	pubmed-550659
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-550659 2005-02-27 BMC Bioinformatics Research Article BioMed Central 2005-02-10 /pmc/articles/PMC550659/ /pubmed/15705192 http://dx.doi.org/10.1186/1471-2105-6-26 Text en Copyright © 2005 Shedden et al; licensee BioMed Central Ltd.
title	Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data
topic	Research Article
url	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC550659/ https://www.ncbi.nlm.nih.gov/pubmed/15705192 http://dx.doi.org/10.1186/1471-2105-6-26
