
Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations

BACKGROUND: Gene expression microarray technologies are widely used across most areas of biological and medical research. Comparing and integrating microarray data from different experiments would be very useful, but is currently very challenging due to the experimental and hybridization conditions,...


Bibliographic Details
Main Authors: Autio, Reija; Kilpinen, Sami; Saarela, Matti; Kallioniemi, Olli; Hautaniemi, Sampsa; Astola, Jaakko
Format: Text
Language: English
Published: BioMed Central 2009
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2648747/
https://www.ncbi.nlm.nih.gov/pubmed/19208124
http://dx.doi.org/10.1186/1471-2105-10-S1-S24
author Autio, Reija
Kilpinen, Sami
Saarela, Matti
Kallioniemi, Olli
Hautaniemi, Sampsa
Astola, Jaakko
collection PubMed
description BACKGROUND: Gene expression microarray technologies are widely used across most areas of biological and medical research. Comparing and integrating microarray data from different experiments would be very useful, but is currently very challenging due to the experimental and hybridization conditions, as well as data preprocessing and normalization methods. Furthermore, even in the case of the widely-used, industry-standard Affymetrix oligonucleotide microarrays, the various array generations have different probe sets representing different genes, hindering the data integration. RESULTS: In this study our objective is to find systematic approaches to normalize the data emerging from different Affymetrix array generations and from different laboratories. We compare and assess the accuracy of five normalization methods for Affymetrix gene expression data using 6,926 Affymetrix experiments from five array generations. The methods that we compare include 1) standardization, 2) housekeeping gene based normalization, 3) equalized quantile normalization, 4) Weibull distribution based normalization and 5) array generation based gene centering. Our results indicate that the best results are achieved when the data is normalized first within a sample and then between-samples with Array Generation based gene Centering (AGC) normalization. CONCLUSION: We conclude that with the AGC method integrating different Affymetrix datasets results in values that are significantly more comparable across the array generations than in the cases where no array generation based normalization is used. The AGC method was found to be the best method for normalizing the data from several different array generations, and achieve comparable gene values across thousands of samples.
format Text
id pubmed-2648747
institution National Center for Biotechnology Information
language English
publishDate 2009
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2648747 2009-03-03 Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations Autio, Reija Kilpinen, Sami Saarela, Matti Kallioniemi, Olli Hautaniemi, Sampsa Astola, Jaakko BMC Bioinformatics Research BioMed Central 2009-01-30 /pmc/articles/PMC2648747/ /pubmed/19208124 http://dx.doi.org/10.1186/1471-2105-10-S1-S24 Text en Copyright © 2009 Autio et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Comparison of Affymetrix data normalization methods using 6,926 experiments across five array generations
topic Research