Novel and simple transformation algorithm for combining microarray data sets

BACKGROUND: With microarray technology, variability in experimental environments such as RNA sources, microarray production, or the use of different platforms, can cause bias. Such systematic differences present a substantial obstacle to the analysis of microarray data, resulting in inconsistent and...

Bibliographic Details
Main authors: Kim, Ki-Yeol; Ki, Dong Hyuk; Jeong, Ha Jin; Jeung, Hei-Cheul; Chung, Hyun Cheol; Rha, Sun Young
Format: Text
Language: English
Published: BioMed Central 2007
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1914088/
https://www.ncbi.nlm.nih.gov/pubmed/17588268
http://dx.doi.org/10.1186/1471-2105-8-218
author Kim, Ki-Yeol
Ki, Dong Hyuk
Jeong, Ha Jin
Jeung, Hei-Cheul
Chung, Hyun Cheol
Rha, Sun Young
collection PubMed
description BACKGROUND: With microarray technology, variability in experimental environments, such as RNA sources, microarray production, or the use of different platforms, can cause bias. Such systematic differences present a substantial obstacle to the analysis of microarray data, resulting in inconsistent and unreliable information. Therefore, one of the most pressing challenges in the field of microarray technology is how to integrate results from different microarray experiments or combine data sets prior to the specific analysis. RESULTS: Two microarray data sets based on a 17k cDNA microarray system were used, consisting of 82 normal colon mucosa and 72 colorectal cancer tissues. Each data set was prepared from either total RNA or amplified mRNA, and the difference in RNA source between these two data sets was detected by an ANOVA (analysis of variance) model. A simple integration method was introduced, based on the distributions of gene expression ratios among different microarray data sets. The method transformed gene expression ratios into the form of a reference data set on a gene-by-gene basis. Hierarchical clustering analysis, density and box plots, and mixture scores with correlation coefficients revealed that the two data sets were well intermingled, indicating that the proposed method minimized the experimental bias. In addition, no RNA source effect was detected after the proposed transformation. In the mixed data set, the two previously identified subgroups of normal and tumor were well separated, and the efficiency of integration was more prominent in the tumor groups than in the normal groups. The transformation method was slightly more effective when a data set with strong homogeneity within the same experimental group was used as the reference data set. CONCLUSION: The proposed method is simple but useful for combining several data sets from different experimental conditions. With this method, biologically useful information can be detected by applying various analytic methods to the combined data set with increased sample size.
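The abstract says the method transforms gene expression ratios into the form of a reference data set gene by gene, but this record does not give the formula. The following is a minimal sketch of one plausible reading, assuming a per-gene location-scale adjustment that matches each gene's mean and standard deviation to those of the reference data set. The function name `transform_to_reference` and the dictionary layout are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def transform_to_reference(target, reference):
    """Rescale each gene in `target` to the mean and SD of that gene in `reference`.

    Both arguments map a gene identifier to a list of log expression ratios
    (one value per sample). Each gene needs at least two samples in each
    data set, because the sample standard deviation is undefined otherwise.
    This is an assumed location-scale adjustment, not the published formula.
    """
    transformed = {}
    for gene, values in target.items():
        ref_values = reference[gene]
        t_mean, t_sd = mean(values), stdev(values)
        r_mean, r_sd = mean(ref_values), stdev(ref_values)
        if t_sd == 0:
            # A constant gene carries no spread to rescale; place it at the
            # reference mean instead of dividing by zero.
            transformed[gene] = [r_mean for _ in values]
        else:
            transformed[gene] = [
                (v - t_mean) / t_sd * r_sd + r_mean for v in values
            ]
    return transformed


# Hypothetical toy data: one gene measured in three samples per data set.
amplified = {"gene_a": [1.0, 2.0, 3.0]}
total_rna = {"gene_a": [10.0, 20.0, 30.0]}
adjusted = transform_to_reference(amplified, total_rna)
```

After the adjustment, each gene in the transformed data set has the same mean and standard deviation as in the reference, which is one simple way the two sets could become comparable before clustering or plotting.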
format Text
id pubmed-1914088
institution National Center for Biotechnology Information
language English
publishDate 2007
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-1914088 2007-07-13 Novel and simple transformation algorithm for combining microarray data sets. BMC Bioinformatics, Research Article. BioMed Central, 2007-06-25. /pmc/articles/PMC1914088/ /pubmed/17588268 http://dx.doi.org/10.1186/1471-2105-8-218 Text en Copyright © 2007 Kim et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title Novel and simple transformation algorithm for combining microarray data sets
topic Research Article