Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes
One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify the differentially expressed genes (DEGs), followed by the employment of the state-of-the-art learning machines including t...
Main authors: | Xia, Xiao-Lei; Xing, Huanlai; Liu, Xueqin |
---|---|
Format: | Online Article Text |
Language: | English |
Published: |
Public Library of Science
2013
|
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3857896/ https://www.ncbi.nlm.nih.gov/pubmed/24349110 http://dx.doi.org/10.1371/journal.pone.0081683 |
_version_ | 1782295215655092224 |
---|---|
author | Xia, Xiao-Lei Xing, Huanlai Liu, Xueqin |
author_facet | Xia, Xiao-Lei Xing, Huanlai Liu, Xueqin |
author_sort | Xia, Xiao-Lei |
collection | PubMed |
description | One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify differentially expressed genes (DEGs), followed by state-of-the-art learning machines, the Support Vector Machine (SVM) in particular. The SVM is a typical sample-based classifier whose performance depends on how discriminant the samples are. However, DEGs identified by statistical tests are not guaranteed to yield a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method, Kernel Matrix Gene Selection (KMGS), is proposed. The rationale of the method, which is rooted in the fundamental ideas of the SVM algorithm, is described. The notion of "the separability of a sample", which is estimated by computing t-like statistics on each column of the kernel matrix, is first introduced. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is Kernel Matrix Sequential Forward Selection (KMSFS), which shares the essential ideas of KMGS but proceeds in a greedy manner. On three public microarray datasets, the proposed algorithms achieved competitive performance in terms of the B.632+ error rate. |
format | Online Article Text |
id | pubmed-3857896 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | Public Library of Science |
record_format | MEDLINE/PubMed |
spelling | pubmed-38578962013-12-13 Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes Xia, Xiao-Lei Xing, Huanlai Liu, Xueqin PLoS One Research Article One of the most important applications of microarray data is the class prediction of biological samples. For this purpose, statistical tests have often been applied to identify differentially expressed genes (DEGs), followed by state-of-the-art learning machines, the Support Vector Machine (SVM) in particular. The SVM is a typical sample-based classifier whose performance depends on how discriminant the samples are. However, DEGs identified by statistical tests are not guaranteed to yield a training dataset composed of discriminant samples. To tackle this problem, a novel gene ranking method, Kernel Matrix Gene Selection (KMGS), is proposed. The rationale of the method, which is rooted in the fundamental ideas of the SVM algorithm, is described. The notion of "the separability of a sample", which is estimated by computing t-like statistics on each column of the kernel matrix, is first introduced. The separability of a classification problem is then measured, from which the significance of a specific gene is deduced. Also described is Kernel Matrix Sequential Forward Selection (KMSFS), which shares the essential ideas of KMGS but proceeds in a greedy manner. On three public microarray datasets, the proposed algorithms achieved competitive performance in terms of the B.632+ error rate.
Public Library of Science 2013-12-09 /pmc/articles/PMC3857896/ /pubmed/24349110 http://dx.doi.org/10.1371/journal.pone.0081683 Text en © 2013 Xia et al http://creativecommons.org/licenses/by/4.0/ This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are properly credited. |
spellingShingle | Research Article Xia, Xiao-Lei Xing, Huanlai Liu, Xueqin Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes |
title | Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes |
title_full | Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes |
title_fullStr | Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes |
title_full_unstemmed | Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes |
title_short | Analyzing Kernel Matrices for the Identification of Differentially Expressed Genes |
title_sort | analyzing kernel matrices for the identification of differentially expressed genes |
topic | Research Article |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3857896/ https://www.ncbi.nlm.nih.gov/pubmed/24349110 http://dx.doi.org/10.1371/journal.pone.0081683 |
work_keys_str_mv | AT xiaxiaolei analyzingkernelmatricesfortheidentificationofdifferentiallyexpressedgenes AT xinghuanlai analyzingkernelmatricesfortheidentificationofdifferentiallyexpressedgenes AT liuxueqin analyzingkernelmatricesfortheidentificationofdifferentiallyexpressedgenes |
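
The description field outlines a ranking idea: score each sample by a statistic computed on its column of the kernel matrix, aggregate those scores into a separability measure for the classification problem, and read off a gene's significance from it. The sketch below is one possible reading of that outline, not the authors' published procedure. It assumes a one-gene RBF kernel, a Welch-style two-sample t statistic as the column statistic, and a plain mean over samples as the aggregate; the names `rbf_kernel`, `sample_separability`, `gene_scores`, `rank_genes`, and the parameter `gamma` are all hypothetical.

```python
import numpy as np


def rbf_kernel(x, gamma=1.0):
    """RBF kernel matrix built from one gene's expression vector x (n_samples,)."""
    diff = x[:, None] - x[None, :]
    return np.exp(-gamma * diff ** 2)


def sample_separability(K, y, j, eps=1e-12):
    """Welch-style t statistic on column j of K (an assumed choice of statistic).

    Compares kernel values between sample j and its own class against
    kernel values between sample j and the other class.
    """
    not_self = np.arange(len(y)) != j        # drop the trivial K[j, j] entry
    same = K[not_self & (y == y[j]), j]
    other = K[not_self & (y != y[j]), j]
    se = np.sqrt(same.var(ddof=1) / same.size + other.var(ddof=1) / other.size)
    return (same.mean() - other.mean()) / (se + eps)


def gene_scores(X, y, gamma=1.0):
    """Score each gene by the mean sample separability (assumed aggregate).

    X has shape (n_samples, n_genes); y holds two class labels.
    Each class needs at least three samples so both variances are defined.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, p = X.shape
    scores = np.empty(p)
    for g in range(p):
        K = rbf_kernel(X[:, g], gamma)
        scores[g] = np.mean([sample_separability(K, y, j) for j in range(n)])
    return scores


def rank_genes(X, y, gamma=1.0):
    """Gene indices ordered from highest to lowest score."""
    return np.argsort(-gene_scores(X, y, gamma))
```

Calling `rank_genes(X, y)` on an expression matrix with samples in rows returns gene indices with the most class-separating gene first. A greedy variant in the spirit of KMSFS would instead grow a gene subset one gene at a time and recompute the kernel on the selected columns at each step; that extension is not shown here.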