Classification of heterogeneous microarray data by maximum entropy kernel

BACKGROUND: There is a large amount of microarray data accumulating in public databases, providing various data waiting to be analyzed jointly. Powerful kernel-based methods are commonly used in microarray analyses with support vector machines (SVMs) to approach a wide range of classification proble...

Bibliographic Details
Main Authors:	Fujibuchi, Wataru; Kato, Tsuyoshi
Format:	Text
Language:	English
Published:	BioMed Central 2007
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1994960/ https://www.ncbi.nlm.nih.gov/pubmed/17651507 http://dx.doi.org/10.1186/1471-2105-8-267

author	Fujibuchi, Wataru; Kato, Tsuyoshi
collection	PubMed
description	BACKGROUND: There is a large amount of microarray data accumulating in public databases, providing various data waiting to be analyzed jointly. Powerful kernel-based methods are commonly used in microarray analyses with support vector machines (SVMs) to approach a wide range of classification problems. However, the standard vectorial data kernel family (linear, RBF, etc.) that takes vectorial data as input, often fails in prediction if the data come from different platforms or laboratories, due to the low gene overlaps or consistencies between the different datasets. RESULTS: We introduce a new type of kernel called maximum entropy (ME) kernel, which has no pre-defined function but is generated by kernel entropy maximization with sample distance matrices as constraints, into the field of SVM classification of microarray data. We assessed the performance of the ME kernel with three different data: heterogeneous kidney carcinoma, noise-introduced leukemia, and heterogeneous oral cavity carcinoma metastasis data. The results clearly show that the ME kernel is very robust for heterogeneous data containing missing values and high-noise, and gives higher prediction accuracies than the standard kernels, namely, linear, polynomial and RBF. CONCLUSION: The results demonstrate its utility in effectively analyzing promiscuous microarray data of rare specimens, e.g., minor diseases or species, that present difficulty in compiling homogeneous data in a single laboratory.
format	Text
id	pubmed-1994960
institution	National Center for Biotechnology Information
language	English
publishDate	2007
publisher	BioMed Central
record_format	MEDLINE/PubMed
spelling	pubmed-1994960 2007-09-28 BMC Bioinformatics Research Article BioMed Central 2007-07-26 /pmc/articles/PMC1994960/ /pubmed/17651507 http://dx.doi.org/10.1186/1471-2105-8-267 Text en Copyright © 2007 Fujibuchi and Kato; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Classification of heterogeneous microarray data by maximum entropy kernel
topic	Research Article

Similar Items