Comparison of small n statistical tests of differential expression applied to microarrays

BACKGROUND: DNA microarrays provide data for genome-wide patterns of expression between observation classes. Microarray studies often have small sample sizes, however, due to cost constraints or specimen availability. This can lead to poor random error estimates and inaccurate statistical tests of...

Bibliographic Details
Main Authors: Murie, Carl, Woody, Owen, Lee, Anna Y, Nadon, Robert
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2674054/
https://www.ncbi.nlm.nih.gov/pubmed/19192265
http://dx.doi.org/10.1186/1471-2105-10-45
_version_ 1782166614772285440
author Murie, Carl
Woody, Owen
Lee, Anna Y
Nadon, Robert
author_facet Murie, Carl
Woody, Owen
Lee, Anna Y
Nadon, Robert
author_sort Murie, Carl
collection PubMed
description BACKGROUND: DNA microarrays provide data for genome-wide patterns of expression between observation classes. Microarray studies often have small sample sizes, however, due to cost constraints or specimen availability. This can lead to poor random error estimates and inaccurate statistical tests of differential expression. We compare the performance of the standard t-test, fold change, and four small n statistical test methods designed to circumvent these problems. We report results of various normalization methods for empirical microarray data and of various random error models for simulated data. RESULTS: Three Empirical Bayes methods (CyberT, BRB, and limma t-statistics) were the most effective statistical tests across simulated and both 2-colour cDNA and Affymetrix experimental data. The CyberT regularized t-statistic in particular was able to maintain expected false positive rates with simulated data showing high variances at low gene intensities, although at the cost of low true positive rates. The Local Pooled Error (LPE) test introduced a bias that lowered false positive rates below theoretically expected values and had lower power relative to the top performers. The standard two-sample t-test and fold change were also found to be sub-optimal for detecting differentially expressed genes. The generalized log transformation was shown to be beneficial in improving results with certain data sets, in particular high-variance cDNA data. CONCLUSION: Pre-processing of data influences performance, and the proper combination of pre-processing and statistical testing is necessary for obtaining the best results. All three Empirical Bayes methods assessed in our study are good choices for statistical tests for small n microarray studies for both Affymetrix and cDNA data. Choice of method for a particular study will depend on software and normalization preferences.
format Text
id pubmed-2674054
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2674054 2009-04-28 Comparison of small n statistical tests of differential expression applied to microarrays Murie, Carl Woody, Owen Lee, Anna Y Nadon, Robert BMC Bioinformatics Methodology Article BACKGROUND: DNA microarrays provide data for genome-wide patterns of expression between observation classes. Microarray studies often have small sample sizes, however, due to cost constraints or specimen availability. This can lead to poor random error estimates and inaccurate statistical tests of differential expression. We compare the performance of the standard t-test, fold change, and four small n statistical test methods designed to circumvent these problems. We report results of various normalization methods for empirical microarray data and of various random error models for simulated data. RESULTS: Three Empirical Bayes methods (CyberT, BRB, and limma t-statistics) were the most effective statistical tests across simulated and both 2-colour cDNA and Affymetrix experimental data. The CyberT regularized t-statistic in particular was able to maintain expected false positive rates with simulated data showing high variances at low gene intensities, although at the cost of low true positive rates. The Local Pooled Error (LPE) test introduced a bias that lowered false positive rates below theoretically expected values and had lower power relative to the top performers. The standard two-sample t-test and fold change were also found to be sub-optimal for detecting differentially expressed genes. The generalized log transformation was shown to be beneficial in improving results with certain data sets, in particular high-variance cDNA data. CONCLUSION: Pre-processing of data influences performance, and the proper combination of pre-processing and statistical testing is necessary for obtaining the best results. All three Empirical Bayes methods assessed in our study are good choices for statistical tests for small n microarray studies for both Affymetrix and cDNA data. Choice of method for a particular study will depend on software and normalization preferences. BioMed Central 2009-02-03 /pmc/articles/PMC2674054/ /pubmed/19192265 http://dx.doi.org/10.1186/1471-2105-10-45 Text en Copyright © 2009 Murie et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Murie, Carl
Woody, Owen
Lee, Anna Y
Nadon, Robert
Comparison of small n statistical tests of differential expression applied to microarrays
title Comparison of small n statistical tests of differential expression applied to microarrays
title_full Comparison of small n statistical tests of differential expression applied to microarrays
title_fullStr Comparison of small n statistical tests of differential expression applied to microarrays
title_full_unstemmed Comparison of small n statistical tests of differential expression applied to microarrays
title_short Comparison of small n statistical tests of differential expression applied to microarrays
title_sort comparison of small n statistical tests of differential expression applied to microarrays
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2674054/
https://www.ncbi.nlm.nih.gov/pubmed/19192265
http://dx.doi.org/10.1186/1471-2105-10-45
work_keys_str_mv AT muriecarl comparisonofsmallnstatisticaltestsofdifferentialexpressionappliedtomicroarrays
AT woodyowen comparisonofsmallnstatisticaltestsofdifferentialexpressionappliedtomicroarrays
AT leeannay comparisonofsmallnstatisticaltestsofdifferentialexpressionappliedtomicroarrays
AT nadonrobert comparisonofsmallnstatisticaltestsofdifferentialexpressionappliedtomicroarrays