
Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes

One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify the differentially expressed genes (DEGs), followed by the employment of the state-of-the-art learning machines including t...


Bibliographic Details
Main authors: Xia, Xiao-Lei, Xing, Huanlai, Liu, Xueqin
Format: Online Article Text
Language: English
Published: Public Library of Science 2013
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3857896/
https://www.ncbi.nlm.nih.gov/pubmed/24349110
http://dx.doi.org/10.1371/journal.pone.0081683
_version_ 1782295215655092224
author Xia, Xiao-Lei
Xing, Huanlai
Liu, Xueqin
author_facet Xia, Xiao-Lei
Xing, Huanlai
Liu, Xueqin
author_sort Xia, Xiao-Lei
collection PubMed
description One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify differentially expressed genes (DEGs), followed by the use of state-of-the-art learning machines, in particular the Support Vector Machine (SVM). The SVM is a typical sample-based classifier whose performance depends on how discriminant the samples are. However, DEGs identified by statistical tests are not guaranteed to yield a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method, Kernel Matrix Gene Selection (KMGS), is proposed. The rationale of the method, which is rooted in the fundamental ideas of the SVM algorithm, is described. The notion of "the separability of a sample", which is estimated by performing [Image: see text]-like statistics on each column of the kernel matrix, is introduced first. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is Kernel Matrix Sequential Forward Selection (KMSFS), a method that shares the essential ideas of KMGS but proceeds in a greedy manner. On three public microarray datasets, the proposed algorithms achieved competitive performance in terms of the B.632+ error rate.
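The idea of scoring each column of a kernel matrix, as summarized in the description, can be illustrated with a minimal sketch. It is not the authors' KMGS algorithm. The statistic is elided in this record, so a Welch-style two-sample t statistic is assumed here as a stand-in. The RBF kernel, the per-gene scoring loop, and the names rbf_kernel, column_separability, and rank_genes are likewise hypothetical choices made for illustration.

```python
import numpy as np


def rbf_kernel(X, gamma=1.0):
    """RBF kernel matrix for samples in the rows of X (assumed kernel choice)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # Clip tiny negative distances caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(d2, 0.0))


def column_separability(K, y):
    """Score each sample j from column j of K.

    Assumption: a Welch-style t statistic compares the kernel values between
    sample j and its own class with those between j and the other class.
    Each class needs at least three samples so both variances are defined.
    """
    n = len(y)
    idx = np.arange(n)
    scores = np.empty(n)
    for j in range(n):
        not_self = idx != j
        same = K[not_self & (y == y[j]), j]
        other = K[not_self & (y != y[j]), j]
        se = np.sqrt(same.var(ddof=1) / same.size + other.var(ddof=1) / other.size)
        scores[j] = (same.mean() - other.mean()) / (se + 1e-12)
    return scores


def rank_genes(X, y, gamma=1.0):
    """Rank genes (columns of X) by the mean column separability of the
    kernel matrix built from each gene alone (hypothetical aggregation)."""
    scores = np.array([
        column_separability(rbf_kernel(X[:, [g]], gamma), y).mean()
        for g in range(X.shape[1])
    ])
    return np.argsort(-scores), scores
```

Calling rank_genes(X, y) on an expression matrix X (samples by genes) with binary labels y returns gene indices ordered from most to least separable under these assumptions. A greedy variant in the spirit of KMSFS would instead grow a gene subset one gene at a time, and is not shown here.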
format Online
Article
Text
id pubmed-3857896
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher Public Library of Science
record_format MEDLINE/PubMed
spelling pubmed-38578962013-12-13 Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes Xia, Xiao-Lei Xing, Huanlai Liu, Xueqin PLoS One Research Article One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify differentially expressed genes (DEGs), followed by the use of state-of-the-art learning machines, in particular the Support Vector Machine (SVM). The SVM is a typical sample-based classifier whose performance depends on how discriminant the samples are. However, DEGs identified by statistical tests are not guaranteed to yield a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method, Kernel Matrix Gene Selection (KMGS), is proposed. The rationale of the method, which is rooted in the fundamental ideas of the SVM algorithm, is described. The notion of "the separability of a sample", which is estimated by performing [Image: see text]-like statistics on each column of the kernel matrix, is introduced first. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is Kernel Matrix Sequential Forward Selection (KMSFS), a method that shares the essential ideas of KMGS but proceeds in a greedy manner. On three public microarray datasets, the proposed algorithms achieved competitive performance in terms of the B.632+ error rate.
Public Library of Science 2013-12-09 /pmc/articles/PMC3857896/ /pubmed/24349110 http://dx.doi.org/10.1371/journal.pone.0081683 Text en © 2013 Xia et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited.
spellingShingle Research Article
Xia, Xiao-Lei
Xing, Huanlai
Liu, Xueqin
Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
title Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
title_full Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
title_fullStr Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
title_full_unstemmed Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
title_short Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
title_sort analyzing kernel matrices for the identification of differentially expressed genes
topic Research Article
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3857896/
https://www.ncbi.nlm.nih.gov/pubmed/24349110
http://dx.doi.org/10.1371/journal.pone.0081683
work_keys_str_mv AT xiaxiaolei analyzingkernelmatricesfortheidentificationofdifferentiallyexpressedgenes
AT xinghuanlai analyzingkernelmatricesfortheidentificationofdifferentiallyexpressedgenes
AT liuxueqin analyzingkernelmatricesfortheidentificationofdifferentiallyexpressedgenes