Normalization and Gene p-Value Estimation: Issues in Microarray Data Processing

INTRODUCTION: Numerous methods exist for basic processing, e.g. normalization, of microarray gene expression data. These methods have an important effect on the final analysis outcome. Therefore, it is crucial to select methods appropriate for a given dataset in order to assure the validity and reli...

Bibliographic Details
Main Authors: Fundel, Katrin, Küffner, Robert, Aigner, Thomas, Zimmer, Ralf
Format: Text
Language: English
Published: Libertas Academica 2008
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2735944/
https://www.ncbi.nlm.nih.gov/pubmed/19812783
author Fundel, Katrin
Küffner, Robert
Aigner, Thomas
Zimmer, Ralf
collection PubMed
description INTRODUCTION: Numerous methods exist for basic processing, e.g. normalization, of microarray gene expression data. These methods have an important effect on the final analysis outcome. Therefore, it is crucial to select methods appropriate for a given dataset in order to ensure the validity and reliability of expression data analysis. Furthermore, biological interpretation requires expression values for genes, which are often represented by several spots or probe sets on a microarray. How best to integrate spot/probe set values into gene values has so far been a somewhat neglected problem. RESULTS: We present a case study comparing different between-array normalization methods with respect to the identification of differentially expressed genes. Our results show that it is feasible and necessary to use prior knowledge of gene expression measurements to select an adequate normalization method for the given data. Furthermore, we provide evidence that combining spot/probe set p-values into gene p-values for detecting differentially expressed genes has advantages over combining expression values for spots/probe sets into gene expression values. The comparison of different methods suggests using Stouffer’s method for this purpose. The study was conducted on gene expression experiments investigating human joint cartilage samples of osteoarthritis-related groups: a cDNA microarray data set (83 samples, four groups) and an Affymetrix data set (26 samples, two groups). CONCLUSION: The apparently straightforward steps of gene expression data analysis, e.g. between-array normalization and detection of differentially regulated genes, can be accomplished by numerous different methods. We analyzed multiple methods and their possible effects, thereby demonstrating the importance of each individual decision taken during data processing. We give guidelines for evaluating normalization outcomes. An overview of these effects via appropriate measures and plots, compared against prior knowledge, is essential for the biological interpretation of gene expression measurements.
format Text
id pubmed-2735944
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher Libertas Academica
record_format MEDLINE/PubMed
spelling pubmed-2735944 2009-09-14 Normalization and Gene p-Value Estimation: Issues in Microarray Data Processing Fundel, Katrin Küffner, Robert Aigner, Thomas Zimmer, Ralf Bioinform Biol Insights Original Research Libertas Academica 2008-05-28 /pmc/articles/PMC2735944/ /pubmed/19812783 Text en Copyright © 2008 The authors. http://creativecommons.org/licenses/by/3.0 This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
title Normalization and Gene p-Value Estimation: Issues in Microarray Data Processing
topic Original Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2735944/
https://www.ncbi.nlm.nih.gov/pubmed/19812783
work_keys_str_mv AT fundelkatrin normalizationandgenepvalueestimationissuesinmicroarraydataprocessing
AT kuffnerrobert normalizationandgenepvalueestimationissuesinmicroarraydataprocessing
AT aignerthomas normalizationandgenepvalueestimationissuesinmicroarraydataprocessing
AT zimmerralf normalizationandgenepvalueestimationissuesinmicroarraydataprocessing
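
The abstract in this record recommends Stouffer’s method for combining spot/probe set p-values into a single gene p-value. The record does not give the authors' exact formulation, so the following is only a minimal sketch of the standard unweighted Stouffer Z-score combination, assuming one-sided p-values and treating the probe sets of a gene as independent (an assumption that may not hold for real arrays). The function name `stouffer_combine`, the gene names, and the example p-values are hypothetical and not taken from the article.

```python
from math import sqrt
from statistics import NormalDist


def stouffer_combine(p_values, eps=1e-15):
    """Combine one-sided p-values with Stouffer's unweighted Z-score method.

    Each p-value is mapped to a z-score z_i = -Phi^{-1}(p_i); the combined
    statistic is Z = sum(z_i) / sqrt(k), and the combined p-value is Phi(-Z).
    """
    p_values = list(p_values)
    if not p_values:
        raise ValueError("need at least one p-value")
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")

    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Clip away exact 0 and 1, where the inverse CDF is undefined.
    clipped = [min(max(p, eps), 1.0 - eps) for p in p_values]
    z_scores = [-std_normal.inv_cdf(p) for p in clipped]
    z_combined = sum(z_scores) / sqrt(len(z_scores))
    return std_normal.cdf(-z_combined)


# Hypothetical probe-set p-values grouped by (made-up) gene identifiers.
probe_p_values = {
    "GENE_A": [0.04, 0.06, 0.03],
    "GENE_B": [0.40, 0.75],
}
gene_p_values = {gene: stouffer_combine(ps) for gene, ps in probe_p_values.items()}
for gene, p in gene_p_values.items():
    print(f"{gene}: combined p = {p:.4g}")
```

Several moderately small probe-set p-values pointing in the same direction yield a smaller combined gene p-value, whereas a single p-value is returned essentially unchanged. Weighted variants of the method also exist and are not shown here.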