How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results

BACKGROUND: Short oligonucleotide arrays for transcript profiling have been available for several years. Generally, raw data from these arrays are analysed with the aid of the Microarray Analysis Suite or GeneChip Operating Software (MAS or GCOS) from Affymetrix. Recently, more methods to analyse th...

Bibliographic Details
Main authors: Millenaar, Frank F, Okyere, John, May, Sean T, van Zanten, Martijn, Voesenek, Laurentius ACJ, Peeters, Anton JM
Format: Text
Language: English
Published: BioMed Central 2006
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1431565/
https://www.ncbi.nlm.nih.gov/pubmed/16539732
http://dx.doi.org/10.1186/1471-2105-7-137
author Millenaar, Frank F
Okyere, John
May, Sean T
van Zanten, Martijn
Voesenek, Laurentius ACJ
Peeters, Anton JM
collection PubMed
description BACKGROUND: Short oligonucleotide arrays for transcript profiling have been available for several years. Generally, raw data from these arrays are analysed with the aid of the Microarray Analysis Suite or GeneChip Operating Software (MAS or GCOS) from Affymetrix. Recently, more methods to analyse the raw data have become available. Ideally, all these methods should produce more or less the same results. We set out to evaluate the different methods, including work on our own data set, in order to test which method gives the most reliable results. RESULTS: Calculating gene expression with six different algorithms (MAS5, dChip PMMM, dChip PM, RMA, GC-RMA and PDNN) from the same (Arabidopsis) data results in different calculated gene expression levels. Consequently, depending on the method used, different genes will be identified as differentially regulated. Surprisingly, there was only 27 to 36% overlap between the different methods. Furthermore, 47.5% of the genes/probe sets showed good correlation between the mismatch and perfect match intensities. CONCLUSION: Of the six algorithms compared, RMA gave the most reproducible results and showed the highest correlation coefficients with Real Time RT-PCR data on genes identified as differentially expressed by all methods. However, we were not able to verify, by Real Time RT-PCR, the microarray results for most genes that were solely calculated by RMA. Furthermore, we conclude that subtraction of the mismatch intensity from the perfect match intensity most likely results in a significant underestimation for at least 47.5% of the expression values. Not one algorithm produced significant expression values for genes present in quantities below 1 pmol. If the only purpose of the microarray experiment is to find new candidate genes, and too many genes are found, then mutual exclusion of the genes predicted by contrasting methods can be used to narrow down the list of new candidate genes by 64 to 73%.
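The abstract's closing suggestion, narrowing a candidate list by keeping only genes that contrasting methods agree on, can be illustrated with a minimal Python sketch. Several things here are assumptions and not taken from the record: the probe-set IDs are hypothetical, the helper names `consensus_genes` and `overlap_fraction` are invented for illustration, and "overlap" is assumed to mean the genes called by every method divided by the genes called by any method. Under that assumed definition, a 27 to 36% overlap leaves a list 64 to 73% shorter, which matches the figures quoted above; the paper's exact definition may differ.

```python
from functools import reduce


def consensus_genes(calls):
    """Return the genes flagged as differentially expressed by every method."""
    if not calls:
        return set()
    return reduce(set.intersection, (set(genes) for genes in calls.values()))


def overlap_fraction(calls):
    """Genes called by all methods divided by genes called by any method.

    This definition of overlap is an assumption made for illustration.
    """
    union = set().union(*calls.values())
    if not union:
        return 0.0
    return len(consensus_genes(calls)) / len(union)


# Hypothetical probe-set IDs per algorithm, for illustration only.
calls = {
    "MAS5": {"g1", "g2", "g3", "g4"},
    "RMA": {"g2", "g3", "g4", "g5"},
    "PDNN": {"g3", "g4", "g6"},
}

consensus = consensus_genes(calls)   # genes every method agrees on
overlap = overlap_fraction(calls)    # share of the combined list that survives
reduction = 1.0 - overlap            # share of the combined list that is dropped

print(sorted(consensus), round(overlap, 3), round(reduction, 3))
```

Intersecting the per-method lists is the strictest reading of "mutual exclusion"; a looser variant could keep genes called by a majority of methods instead.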
format Text
id pubmed-1431565
institution National Center for Biotechnology Information
language English
publishDate 2006
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1431565 2006-04-06 How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results Millenaar, Frank F Okyere, John May, Sean T van Zanten, Martijn Voesenek, Laurentius ACJ Peeters, Anton JM BMC Bioinformatics Research Article BioMed Central 2006-03-15 /pmc/articles/PMC1431565/ /pubmed/16539732 http://dx.doi.org/10.1186/1471-2105-7-137 Text en Copyright © 2006 Millenaar et al; licensee BioMed Central Ltd.
title How to decide? Different methods of calculating gene expression from short oligonucleotide array data will give different results
topic Research Article
work_keys_str_mv AT millenaarfrankf howtodecidedifferentmethodsofcalculatinggeneexpressionfromshortoligonucleotidearraydatawillgivedifferentresults
AT okyerejohn howtodecidedifferentmethodsofcalculatinggeneexpressionfromshortoligonucleotidearraydatawillgivedifferentresults
AT mayseant howtodecidedifferentmethodsofcalculatinggeneexpressionfromshortoligonucleotidearraydatawillgivedifferentresults
AT vanzantenmartijn howtodecidedifferentmethodsofcalculatinggeneexpressionfromshortoligonucleotidearraydatawillgivedifferentresults
AT voeseneklaurentiusacj howtodecidedifferentmethodsofcalculatinggeneexpressionfromshortoligonucleotidearraydatawillgivedifferentresults
AT peetersantonjm howtodecidedifferentmethodsofcalculatinggeneexpressionfromshortoligonucleotidearraydatawillgivedifferentresults