
A new approach to enhance the performance of decision tree for classifying gene expression data

BACKGROUND: Gene expression data classification is a challenging task because of the high dimensionality and very small number of samples. Decision trees are one of the popular machine learning approaches to such classification problems. However, the existing decision tree algorithms use a single...


Bibliographic Details
Main Authors: Hassan, Md Rafiul, Kotagiri, Ramamohanarao
Format: Online Article Text
Language: English
Published: BioMed Central 2013
Subjects:
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4044984/
https://www.ncbi.nlm.nih.gov/pubmed/24564916
http://dx.doi.org/10.1186/1753-6561-7-S7-S3
_version_ 1782319231242600448
author Hassan, Md Rafiul
Kotagiri, Ramamohanarao
author_facet Hassan, Md Rafiul
Kotagiri, Ramamohanarao
author_sort Hassan, Md Rafiul
collection PubMed
description BACKGROUND: Gene expression data classification is a challenging task because of the high dimensionality and very small number of samples. Decision trees are one of the popular machine learning approaches to such classification problems. However, the existing decision tree algorithms use a single gene feature at each node to split the data into its child nodes and hence might perform poorly, especially when classifying gene expression datasets. RESULTS: By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristics curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in the literature. CONCLUSION: We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree.
format Online
Article
Text
id pubmed-4044984
institution National Center for Biotechnology Information
language English
publishDate 2013
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-40449842014-06-19 A new approach to enhance the performance of decision tree for classifying gene expression data Hassan, Md Rafiul Kotagiri, Ramamohanarao BMC Proc Proceedings BACKGROUND: Gene expression data classification is a challenging task because of the high dimensionality and very small number of samples. Decision trees are one of the popular machine learning approaches to such classification problems. However, the existing decision tree algorithms use a single gene feature at each node to split the data into its child nodes and hence might perform poorly, especially when classifying gene expression datasets. RESULTS: By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristics curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in the literature. CONCLUSION: We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree. BioMed Central 2013-12-20 /pmc/articles/PMC4044984/ /pubmed/24564916 http://dx.doi.org/10.1186/1753-6561-7-S7-S3 Text en Copyright © 2013 Hassan and Kotagiri; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
spellingShingle Proceedings
Hassan, Md Rafiul
Kotagiri, Ramamohanarao
A new approach to enhance the performance of decision tree for classifying gene expression data
title A new approach to enhance the performance of decision tree for classifying gene expression data
title_full A new approach to enhance the performance of decision tree for classifying gene expression data
title_fullStr A new approach to enhance the performance of decision tree for classifying gene expression data
title_full_unstemmed A new approach to enhance the performance of decision tree for classifying gene expression data
title_short A new approach to enhance the performance of decision tree for classifying gene expression data
title_sort new approach to enhance the performance of decision tree for classifying gene expression data
topic Proceedings
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4044984/
https://www.ncbi.nlm.nih.gov/pubmed/24564916
http://dx.doi.org/10.1186/1753-6561-7-S7-S3
work_keys_str_mv AT hassanmdrafiul anewapproachtoenhancetheperformanceofdecisiontreeforclassifyinggeneexpressiondata
AT kotagiriramamohanarao anewapproachtoenhancetheperformanceofdecisiontreeforclassifyinggeneexpressiondata
AT hassanmdrafiul newapproachtoenhancetheperformanceofdecisiontreeforclassifyinggeneexpressiondata
AT kotagiriramamohanarao newapproachtoenhancetheperformanceofdecisiontreeforclassifyinggeneexpressiondata
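
Illustrative note on the description field: the abstract says each tree node combines several genes through a linear function and that AUC guides the tree structure. The sketch below shows, on made-up toy data, how a linear combination of two genes can reach a higher AUC than either gene alone. It is not the authors' algorithm: the function names (`auc`, `composite`, `best_pair`), the exhaustive pair search, and the weight grid are all assumptions chosen for illustration, and the record does not say how the paper selects genes or weights.

```python
from itertools import product


def auc(scores, labels):
    """AUC via pairwise comparison: the fraction of (positive, negative)
    pairs where the positive sample scores higher; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def composite(samples, genes, weights):
    """Linear combination of the selected gene columns for each sample."""
    return [sum(w * row[g] for g, w in zip(genes, weights)) for row in samples]


def best_pair(samples, labels, weight_grid=(-1.0, -0.5, 0.5, 1.0)):
    """Hypothetical search: try every gene pair and every weight pair from
    a small grid, and keep the first combination with the highest AUC."""
    n_genes = len(samples[0])
    best_score, best_genes, best_weights = 0.0, None, None
    for g1 in range(n_genes):
        for g2 in range(g1 + 1, n_genes):
            for w in product(weight_grid, repeat=2):
                score = auc(composite(samples, (g1, g2), w), labels)
                if score > best_score:
                    best_score, best_genes, best_weights = score, (g1, g2), w
    return best_score, best_genes, best_weights


# Toy data (invented): rows are samples, columns are three "genes".
# Gene 0 minus gene 1 separates the classes; gene 2 is noise.
samples = [
    (1, 0, 5), (2, 1, 1), (3, 2, 4), (4, 3, 2),   # class 1
    (0, 1, 3), (1, 2, 6), (2, 3, 0), (3, 4, 5),   # class 0
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

single_gene_auc = auc([row[0] for row in samples], labels)   # 0.71875
pair_auc, pair_genes, pair_weights = best_pair(samples, labels)

print("gene 0 alone:", single_gene_auc)
print("best composite:", pair_auc, pair_genes, pair_weights)
```

On this toy data the composite of genes 0 and 1 ranks every class-1 sample above every class-0 sample, which a threshold on a single gene cannot do. A full tree builder would repeat such a search at each node on the samples that reach it; that part is left out here.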