The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison
BACKGROUND: Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and...
Main authors: | Sioson, Allan A; Mane, Shrinivasrao P; Li, Pinghua; Sha, Wei; Heath, Lenwood S; Bohnert, Hans J; Grene, Ruth |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1513403/ https://www.ncbi.nlm.nih.gov/pubmed/16626497 http://dx.doi.org/10.1186/1471-2105-7-215 |
_version_ | 1782128493444726784 |
---|---|
author | Sioson, Allan A Mane, Shrinivasrao P Li, Pinghua Sha, Wei Heath, Lenwood S Bohnert, Hans J Grene, Ruth |
author_sort | Sioson, Allan A |
collection | PubMed |
description | BACKGROUND: Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. RESULTS: The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. CONCLUSION: The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. 
The results of Expresso using MIV data have the best agreement with qRT-PCR results. In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity. |
format | Text |
id | pubmed-1513403 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1513403 2006-07-21 The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison Sioson, Allan A Mane, Shrinivasrao P Li, Pinghua Sha, Wei Heath, Lenwood S Bohnert, Hans J Grene, Ruth BMC Bioinformatics Research Article BioMed Central 2006-04-20 /pmc/articles/PMC1513403/ /pubmed/16626497 http://dx.doi.org/10.1186/1471-2105-7-215 Text en Copyright © 2006 Sioson et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. |
title | The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1513403/ https://www.ncbi.nlm.nih.gov/pubmed/16626497 http://dx.doi.org/10.1186/1471-2105-7-215 |
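
The abstract contrasts two kinds of input data: integrated intensity values (IIV), derived from average intensities and spot pixel counts, and median intensity values (MIV). The sketch below illustrates that contrast on invented numbers. It assumes IIV is simply the average intensity multiplied by the pixel count, and it uses a plain one-sample t statistic on log2 ratios; neither is confirmed by the record as the actual TM4 or Expresso computation, and all function names and data are hypothetical.

```python
import math
import statistics


def integrated_intensity(mean_intensity: float, pixel_count: int) -> float:
    """Assumed IIV definition: average spot intensity times spot pixel count."""
    return mean_intensity * pixel_count


def log2_ratio(treated: float, control: float) -> float:
    """Log2 fold change between the two channels of one spot."""
    return math.log2(treated / control)


def one_sample_t(values: list[float]) -> float:
    """t statistic for the null hypothesis that the mean log ratio is zero."""
    n = len(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return statistics.mean(values) / standard_error


# Hypothetical replicate spots for one gene. Each channel holds
# (mean intensity, spot pixel count, median intensity).
replicates = [
    {"treated": (1200.0, 80, 1150.0), "control": (600.0, 82, 590.0)},
    {"treated": (1100.0, 75, 1080.0), "control": (610.0, 78, 600.0)},
    {"treated": (1300.0, 85, 1210.0), "control": (580.0, 80, 575.0)},
]

# Log ratios built from the assumed IIV (mean times pixel count).
iiv_ratios = [
    log2_ratio(
        integrated_intensity(*spot["treated"][:2]),
        integrated_intensity(*spot["control"][:2]),
    )
    for spot in replicates
]
# Log ratios built directly from the median intensities.
miv_ratios = [
    log2_ratio(spot["treated"][2], spot["control"][2]) for spot in replicates
]

print("IIV t statistic:", one_sample_t(iiv_ratios))
print("MIV t statistic:", one_sample_t(miv_ratios))
```

Because IIV folds the spot pixel count into the signal, variation in spot size between channels or replicates feeds into the log ratios, whereas median-based ratios do not depend on spot size. This is only one plausible way the two inputs could lead to different sets of significant genes.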