How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results
BACKGROUND: Short oligonucleotide arrays for transcript profiling have been available for several years. Generally, raw data from these arrays are analysed with the aid of the Microarray Analysis Suite or GeneChip Operating Software (MAS or GCOS) from Affymetrix. Recently, more methods to analyse th...
Main authors: | Millenaar, Frank F; Okyere, John; May, Sean T; van Zanten, Martijn; Voesenek, Laurentius ACJ; Peeters, Anton JM |
---|---|
Format: | Text |
Language: | English |
Published: | BioMed Central 2006 |
Subjects: | Research Article |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1431565/ https://www.ncbi.nlm.nih.gov/pubmed/16539732 http://dx.doi.org/10.1186/1471-2105-7-137 |
_version_ | 1782127204594876416 |
---|---|
author | Millenaar, Frank F Okyere, John May, Sean T van Zanten, Martijn Voesenek, Laurentius ACJ Peeters, Anton JM |
author_sort | Millenaar, Frank F |
collection | PubMed |
description | BACKGROUND: Short oligonucleotide arrays for transcript profiling have been available for several years. Generally, raw data from these arrays are analysed with the aid of the Microarray Analysis Suite or GeneChip Operating Software (MAS or GCOS) from Affymetrix. Recently, more methods to analyse the raw data have become available. Ideally all these methods should come up with more or less the same results. We set out to evaluate the different methods and include work on our own data set, in order to test which method gives the most reliable results. RESULTS: Calculating gene expression with 6 different algorithms (MAS5, dChip PMMM, dChip PM, RMA, GC-RMA and PDNN) using the same (Arabidopsis) data, results in different calculated gene expression levels. Consequently, depending on the method used, different genes will be identified as differentially regulated. Surprisingly, there was only 27 to 36% overlap between the different methods. Furthermore, 47.5% of the genes/probe sets showed good correlation between the mismatch and perfect match intensities. CONCLUSION: After comparing six algorithms, RMA gave the most reproducible results and showed the highest correlation coefficients with Real Time RT-PCR data on genes identified as differentially expressed by all methods. However, we were not able to verify, by Real Time RT-PCR, the microarray results for most genes that were solely calculated by RMA. Furthermore, we conclude that subtraction of the mismatch intensity from the perfect match intensity results most likely in a significant underestimation for at least 47.5% of the expression values. Not one algorithm produced significant expression values for genes present in quantities below 1 pmol. If the only purpose of the microarray experiment is to find new candidate genes, and too many genes are found, then mutual exclusion of the genes predicted by contrasting methods can be used to narrow down the list of new candidate genes by 64 to 73%. |
format | Text |
id | pubmed-1431565 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2006 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-1431565 2006-04-06 How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results Millenaar, Frank F Okyere, John May, Sean T van Zanten, Martijn Voesenek, Laurentius ACJ Peeters, Anton JM BMC Bioinformatics Research Article BioMed Central 2006-03-15 /pmc/articles/PMC1431565/ /pubmed/16539732 http://dx.doi.org/10.1186/1471-2105-7-137 Text en Copyright © 2006 Millenaar et al; licensee BioMed Central Ltd. |
title | How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1431565/ https://www.ncbi.nlm.nih.gov/pubmed/16539732 http://dx.doi.org/10.1186/1471-2105-7-137 |
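
The abstract reports 27 to 36% overlap between methods and a 64 to 73% reduction of the candidate list when only shared genes are kept; the two ranges are complements (100 − 36 = 64, 100 − 27 = 73). The sketch below illustrates that arithmetic with set intersection. It is not the authors' procedure: the probe-set IDs, the function name `overlap_and_reduction`, and the choice of one method's list as the denominator are all assumptions made for illustration.

```python
def overlap_and_reduction(calls_a, calls_b):
    """Return (overlap %, reduction %) relative to the list from method A.

    Assumption: overlap is measured against method A's list; the record
    does not state which denominator the authors used.
    """
    shared = calls_a & calls_b  # genes called by both methods
    overlap_pct = 100.0 * len(shared) / len(calls_a)
    # Keeping only the shared genes discards the rest of method A's list.
    reduction_pct = 100.0 - overlap_pct
    return overlap_pct, reduction_pct


# Hypothetical probe-set IDs: 100 calls from one method, 86 from another,
# with 36 in common.
method_a = {f"probe_{i}" for i in range(100)}
method_b = {f"probe_{i}" for i in range(64, 150)}

overlap, reduction = overlap_and_reduction(method_a, method_b)
print(f"overlap: {overlap:.0f}%, list reduced by: {reduction:.0f}%")
```

With real data, each set would hold the genes one algorithm (for example RMA or MAS5) flags as differentially expressed, and the intersection could be taken across more than two methods.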