
Random forest for gene selection and microarray data classification

A random forest method has been selected to perform both gene selection and classification of microarray data. In this embedded method, selecting the smallest possible set of genes with the lowest error rate is the key factor in achieving the highest classification accuracy. Hence, improved gene se...


Bibliographic Details
Main authors: Moorthy, Kohbalan; Mohamad, Mohd Saberi
Format: Online Article Text
Language: English
Published: Biomedical Informatics 2011
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3218317/
https://www.ncbi.nlm.nih.gov/pubmed/22125385
_version_ 1782216689333567488
author Moorthy, Kohbalan
Mohamad, Mohd Saberi
author_facet Moorthy, Kohbalan
Mohamad, Mohd Saberi
author_sort Moorthy, Kohbalan
collection PubMed
description A random forest method has been selected to perform both gene selection and classification of microarray data. In this embedded method, selecting the smallest possible set of genes with the lowest error rate is the key factor in achieving the highest classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes as well as the biggest subset of genes prior to classification. The option of biggest-subset selection is provided to assist researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed better at selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag error rates. Furthermore, classification performed on the selected subset of genes using random forest led to lower prediction error rates than the existing method and other similar available methods.
format Online
Article
Text
id pubmed-3218317
institution National Center for Biotechnology Information
language English
publishDate 2011
publisher Biomedical Informatics
record_format MEDLINE/PubMed
spelling pubmed-3218317 2011-11-28 Random forest for gene selection and microarray data classification Moorthy, Kohbalan Mohamad, Mohd Saberi Bioinformation Hypothesis A random forest method has been selected to perform both gene selection and classification of microarray data. In this embedded method, selecting the smallest possible set of genes with the lowest error rate is the key factor in achieving the highest classification accuracy. Hence, an improved gene selection method using random forest is proposed to obtain the smallest subset of genes as well as the biggest subset of genes prior to classification. The option of biggest-subset selection is provided to assist researchers who intend to use the informative genes for further research. The enhanced random forest gene selection performed better at selecting both the smallest and the biggest subsets of informative genes with the lowest out-of-bag error rates. Furthermore, classification performed on the selected subset of genes using random forest led to lower prediction error rates than the existing method and other similar available methods. Biomedical Informatics 2011-09-28 /pmc/articles/PMC3218317/ /pubmed/22125385 Text en © 2011 Biomedical Informatics This is an open-access article, which permits unrestricted use, distribution, and reproduction in any medium, for non-commercial purposes, provided the original author and source are credited.
spellingShingle Hypothesis
Moorthy, Kohbalan
Mohamad, Mohd Saberi
Random forest for gene selection and microarray data classification
title Random forest for gene selection and microarray data classification
title_full Random forest for gene selection and microarray data classification
title_fullStr Random forest for gene selection and microarray data classification
title_full_unstemmed Random forest for gene selection and microarray data classification
title_short Random forest for gene selection and microarray data classification
title_sort random forest for gene selection and microarray data classification
topic Hypothesis
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3218317/
https://www.ncbi.nlm.nih.gov/pubmed/22125385
work_keys_str_mv AT moorthykohbalan randomforestforgeneselectionandmicroarraydataclassification
AT mohamadmohdsaberi randomforestforgeneselectionandmicroarraydataclassification
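
The description field above refers to choosing the smallest or the biggest gene subset among those with the lowest out-of-bag (OOB) error. The record gives no algorithmic detail, so the following is only a minimal sketch of that final choice, assuming the OOB error of each candidate subset has already been estimated by a random forest. The function name `select_subsets`, the `tolerance` parameter, and the example values are hypothetical and do not come from the article.

```python
from typing import List, Tuple

# Hypothetical record: (number of genes in the subset, out-of-bag error rate).
Candidate = Tuple[int, float]


def select_subsets(
    candidates: List[Candidate], tolerance: float = 0.0
) -> Tuple[Candidate, Candidate]:
    """Return the (smallest, biggest) candidate subsets whose OOB error
    lies within `tolerance` of the lowest OOB error among all candidates.

    `tolerance` is an illustrative assumption, not a parameter described
    in the source record.
    """
    if not candidates:
        raise ValueError("need at least one candidate subset")
    best_error = min(err for _, err in candidates)
    # Keep only the subsets that are (nearly) as accurate as the best one.
    eligible = [c for c in candidates if c[1] <= best_error + tolerance]
    smallest = min(eligible, key=lambda c: c[0])
    biggest = max(eligible, key=lambda c: c[0])
    return smallest, biggest


# Made-up illustrative values, not results from the article.
example = [(200, 0.10), (100, 0.05), (50, 0.05), (25, 0.05), (12, 0.08)]
smallest, biggest = select_subsets(example)
print("smallest:", smallest, "biggest:", biggest)
```

With `tolerance=0.0` only subsets tied for the lowest OOB error are eligible; a larger tolerance trades a small increase in error for a more compact gene set.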