
Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes

BACKGROUND: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative...


Bibliographic Details
Main Authors:	Warnat, Patrick; Eils, Roland; Brors, Benedikt
Format:	Text
Language:	English
Published:	BioMed Central 2005
Subjects:	Methodology Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1312314/ https://www.ncbi.nlm.nih.gov/pubmed/16271137 http://dx.doi.org/10.1186/1471-2105-6-265

author	Warnat, Patrick; Eils, Roland; Brors, Benedikt
collection	PubMed
description	BACKGROUND: The extensive use of DNA microarray technology in the characterization of the cell transcriptome is leading to an ever-increasing amount of microarray data from cancer studies. Although similar questions for the same type of cancer are addressed in these different studies, a comparative analysis of their results is hampered by the use of heterogeneous microarray platforms and analysis methods.
RESULTS: In contrast to a meta-analysis approach where results of different studies are combined on an interpretative level, we investigate here how to directly integrate raw microarray data from different studies for the purpose of supervised classification analysis. We use median rank scores and quantile discretization to derive numerically comparable measures of gene expression from different platforms. These transformed data are then used for training of classifiers based on support vector machines. We apply this approach to six publicly available cancer microarray gene expression data sets, which consist of three pairs of studies, each examining the same type of cancer, i.e. breast cancer, prostate cancer or acute myeloid leukemia. For each pair, one study was performed by means of cDNA microarrays and the other by means of oligonucleotide microarrays. In each pair, high classification accuracies (> 85%) were achieved with training and testing on data instances randomly chosen from both data sets in a cross-validation analysis. To exemplify the potential of this cross-platform classification analysis, we use two leukemia microarray data sets to show that important genes with regard to the biology of leukemia are selected in an integrated analysis, which are missed in either single-set analysis.
CONCLUSION: Cross-platform classification of multiple cancer microarray data sets yields discriminative gene expression signatures that are found and validated on a large number of microarray samples, generated by different laboratories and microarray technologies. Predictive models generated by this approach are better validated than those generated on a single data set, while showing high predictive power and improved generalization performance.
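The abstract names two transformations, quantile discretization and median rank scores, without giving their details. The following is a minimal Python sketch of one plausible reading of each, not the authors' implementation. The function names, the number of bins, the choice of reference data set and the tie handling are all assumptions made for illustration.

```python
from statistics import median


def quantile_discretize(sample, n_bins=8):
    """Replace each expression value in one sample by a bin index 0..n_bins-1.

    Bins are derived from the rank of the value within the sample, so two
    platforms with different measurement scales yield comparable integers.
    n_bins=8 is an arbitrary illustrative default. Ties are broken by
    position in the list (an assumption).
    """
    order = sorted(range(len(sample)), key=lambda i: sample[i])
    bins = [0] * len(sample)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(sample)
    return bins


def median_rank_scores(reference, sample):
    """Map one sample onto the scale of a reference data set (hypothetical reading).

    reference: list of samples, each a list of values in the same gene order.
    sample:    one sample from another platform, same gene order.
    Each value in `sample` is replaced by the reference gene median that
    holds the same rank among all reference gene medians.
    """
    n_genes = len(sample)
    ref_medians = sorted(median(s[g] for s in reference) for g in range(n_genes))
    order = sorted(range(n_genes), key=lambda i: sample[i])
    scores = [0.0] * n_genes
    for rank, i in enumerate(order):
        scores[i] = ref_medians[rank]
    return scores


# Toy values (invented): log-ratios from a cDNA array and raw intensities
# from an oligonucleotide array, same four genes in the same order.
cdna_sample = [0.5, -1.2, 2.0, 0.1]
oligo_sample = [800.0, 35.0, 5200.0, 410.0]
cdna_bins = quantile_discretize(cdna_sample, n_bins=4)
oligo_bins = quantile_discretize(oligo_sample, n_bins=4)
```

Both toy samples order their genes the same way, so they receive identical bin vectors even though their raw scales differ by orders of magnitude. Transformed vectors of this kind could then be pooled across studies as classifier input, as the abstract describes for support vector machines.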
format	Text
id	pubmed-1312314
institution	National Center for Biotechnology Information
language	English
publishDate	2005
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1312314 2005-12-14 BMC Bioinformatics Methodology Article BioMed Central 2005-11-04 /pmc/articles/PMC1312314/ /pubmed/16271137 http://dx.doi.org/10.1186/1471-2105-6-265 Text en Copyright © 2005 Warnat et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Cross-platform analysis of cancer microarray data improves gene expression based classification of phenotypes
topic	Methodology Article
