
Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data

BACKGROUND: Microarray technology has become very popular for globally evaluating gene expression in biological samples. However, non-linear variation associated with the technology can make data interpretation unreliable. Therefore, methods to correct this kind of technical variation are critical....


Bibliographic Details
Main authors: Pelz, Carl R, Kulesz-Martin, Molly, Bagby, Grover, Sears, Rosalie C
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2644708/
https://www.ncbi.nlm.nih.gov/pubmed/19055840
http://dx.doi.org/10.1186/1471-2105-9-520
_version_ 1782164752532766720
author Pelz, Carl R
Kulesz-Martin, Molly
Bagby, Grover
Sears, Rosalie C
author_facet Pelz, Carl R
Kulesz-Martin, Molly
Bagby, Grover
Sears, Rosalie C
author_sort Pelz, Carl R
collection PubMed
description BACKGROUND: Microarray technology has become very popular for globally evaluating gene expression in biological samples. However, non-linear variation associated with the technology can make data interpretation unreliable. Therefore, methods to correct this kind of technical variation are critical. Here we consider a method to reduce this type of variation applied after three common procedures for processing microarray data: MAS 5.0, RMA, and dChip(®). RESULTS: We commonly observe intensity-dependent technical variation between samples in a single microarray experiment. This is most common when MAS 5.0 is used to process probe level data, but we also see this type of technical variation with RMA- and dChip(®)-processed data. Datasets with unbalanced numbers of up- and down-regulated genes seem to be particularly susceptible to this type of intensity-dependent technical variation. Unbalanced gene regulation is common when studying cancer samples or genetically manipulated animal models, and preservation of this biologically relevant information while removing technical variation has not been well addressed in the literature. We propose a method based on using rank-invariant, endogenous transcripts as reference points for normalization (GRSN). While the use of rank-invariant transcripts has been described previously, we have added to this concept by the creation of a global rank-invariant set of transcripts used to generate a robust average reference that is used to normalize all samples within a dataset. The global rank-invariant set is selected in an iterative manner so as to preserve unbalanced gene expression. Moreover, our method works well as an overlay that can be applied to data already processed with other probe set summary methods. We demonstrate that this additional normalization step at the "probe set level" effectively corrects a specific type of technical variation that often distorts samples in datasets. CONCLUSION: We have developed a simple post-processing tool to help detect and correct non-linear technical variation in microarray data and demonstrate how it can reduce technical variation and improve the results of downstream statistical gene selection and pathway identification methods.
format Text
id pubmed-2644708
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2644708 2009-02-19 Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data Pelz, Carl R Kulesz-Martin, Molly Bagby, Grover Sears, Rosalie C BMC Bioinformatics Methodology Article BACKGROUND: Microarray technology has become very popular for globally evaluating gene expression in biological samples. However, non-linear variation associated with the technology can make data interpretation unreliable. Therefore, methods to correct this kind of technical variation are critical. Here we consider a method to reduce this type of variation applied after three common procedures for processing microarray data: MAS 5.0, RMA, and dChip(®). RESULTS: We commonly observe intensity-dependent technical variation between samples in a single microarray experiment. This is most common when MAS 5.0 is used to process probe level data, but we also see this type of technical variation with RMA- and dChip(®)-processed data. Datasets with unbalanced numbers of up- and down-regulated genes seem to be particularly susceptible to this type of intensity-dependent technical variation. Unbalanced gene regulation is common when studying cancer samples or genetically manipulated animal models, and preservation of this biologically relevant information while removing technical variation has not been well addressed in the literature. We propose a method based on using rank-invariant, endogenous transcripts as reference points for normalization (GRSN). While the use of rank-invariant transcripts has been described previously, we have added to this concept by the creation of a global rank-invariant set of transcripts used to generate a robust average reference that is used to normalize all samples within a dataset. The global rank-invariant set is selected in an iterative manner so as to preserve unbalanced gene expression. Moreover, our method works well as an overlay that can be applied to data already processed with other probe set summary methods. We demonstrate that this additional normalization step at the "probe set level" effectively corrects a specific type of technical variation that often distorts samples in datasets. CONCLUSION: We have developed a simple post-processing tool to help detect and correct non-linear technical variation in microarray data and demonstrate how it can reduce technical variation and improve the results of downstream statistical gene selection and pathway identification methods. BioMed Central 2008-12-04 /pmc/articles/PMC2644708/ /pubmed/19055840 http://dx.doi.org/10.1186/1471-2105-9-520 Text en Copyright © 2008 Pelz et al; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Methodology Article
Pelz, Carl R
Kulesz-Martin, Molly
Bagby, Grover
Sears, Rosalie C
Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data
title Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data
title_full Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data
title_fullStr Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data
title_full_unstemmed Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data
title_short Global rank-invariant set normalization (GRSN) to reduce systematic distortions in microarray data
title_sort global rank-invariant set normalization (grsn) to reduce systematic distortions in microarray data
topic Methodology Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2644708/
https://www.ncbi.nlm.nih.gov/pubmed/19055840
http://dx.doi.org/10.1186/1471-2105-9-520
work_keys_str_mv AT pelzcarlr globalrankinvariantsetnormalizationgrsntoreducesystematicdistortionsinmicroarraydata
AT kuleszmartinmolly globalrankinvariantsetnormalizationgrsntoreducesystematicdistortionsinmicroarraydata
AT bagbygrover globalrankinvariantsetnormalizationgrsntoreducesystematicdistortionsinmicroarraydata
AT searsrosaliec globalrankinvariantsetnormalizationgrsntoreducesystematicdistortionsinmicroarraydata
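
Illustrative note: the abstract in this record outlines three steps: iteratively select a global rank-invariant set of probe sets, build a robust average reference, and normalize every sample against that reference. The Python sketch below is one possible reading of those steps, not the authors' implementation. Every function name and parameter (`select_invariant_set`, `keep`, `drop_fraction`, `n_bins`) is hypothetical. The per-probe median as the "robust average" and the binned-median, piecewise-linear calibration curve are assumptions standing in for details this record does not give. Input values are assumed to be on a log scale.

```python
from statistics import median, pvariance

def ranks(values):
    """Rank of each value within its sample (0 = smallest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for pos, i in enumerate(order):
        r[i] = pos
    return r

def select_invariant_set(data, keep, drop_fraction=0.25):
    """Iteratively drop the probe sets whose ranks vary most across samples.

    data: list of samples, each a list of log-scale values (one per probe set).
    Ranks are recomputed on the surviving probe sets at every iteration, so
    strongly regulated probe sets are removed before they bias the ranking.
    """
    idx = list(range(len(data[0])))
    while len(idx) > keep:
        sample_ranks = [ranks([s[i] for i in idx]) for s in data]
        var = [pvariance([sr[k] for sr in sample_ranks]) for k in range(len(idx))]
        n_drop = max(1, min(len(idx) - keep, int(len(idx) * drop_fraction)))
        by_var = sorted(range(len(idx)), key=lambda k: var[k])
        idx = sorted(idx[k] for k in by_var[:len(idx) - n_drop])
    return idx

def fit_curve(x, d, n_bins=5):
    """Binned medians of (intensity, deviation): a crude calibration curve."""
    pairs = sorted(zip(x, d))
    size = max(1, len(pairs) // n_bins)
    knots = []
    for start in range(0, len(pairs), size):
        chunk = pairs[start:start + size]
        knots.append((median(p[0] for p in chunk), median(p[1] for p in chunk)))
    return knots

def interpolate(knots, x):
    """Piecewise-linear interpolation, held constant beyond the end knots."""
    if x <= knots[0][0]:
        return knots[0][1]
    if x >= knots[-1][0]:
        return knots[-1][1]
    for (x0, y0), (x1, y1) in zip(knots, knots[1:]):
        if x0 <= x <= x1:
            if x1 == x0:
                return y0
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    return knots[-1][1]

def grsn_sketch(data, keep, n_bins=5):
    """Normalize each sample against a median reference, using a curve
    fitted only on the rank-invariant probe sets."""
    inv = select_invariant_set(data, keep)
    ref = [median(s[i] for s in data) for i in range(len(data[0]))]
    out = []
    for s in data:
        knots = fit_curve([s[i] for i in inv], [s[i] - ref[i] for i in inv], n_bins)
        out.append([v - interpolate(knots, v) for v in s])
    return out
```

For example, `grsn_sketch(data, keep=5000)` would fit each sample's correction on the 5000 most rank-stable probe sets; the value 5000 is only a placeholder here and would need tuning for a real dataset.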