A new approach to enhance the performance of decision tree for classifying gene expression data
BACKGROUND: Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. Decision tree is one of the popular machine learning approaches to address such classification problems. However, the existing decision tree algorithms use a single...
Main Authors: | Hassan, Md Rafiul; Kotagiri, Ramamohanarao |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | BioMed Central 2013 |
Subjects: | Proceedings |
Online Access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4044984/ https://www.ncbi.nlm.nih.gov/pubmed/24564916 http://dx.doi.org/10.1186/1753-6561-7-S7-S3 |
_version_ | 1782319231242600448 |
---|---|
author | Hassan, Md Rafiul Kotagiri, Ramamohanarao |
author_facet | Hassan, Md Rafiul Kotagiri, Ramamohanarao |
author_sort | Hassan, Md Rafiul |
collection | PubMed |
description | BACKGROUND: Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. Decision trees are one of the popular machine learning approaches to address such classification problems. However, the existing decision tree algorithms use a single gene feature at each node to split the data into its child nodes and hence might suffer from poor performance, especially when classifying gene expression datasets. RESULTS: By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristics curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in the literature. CONCLUSION: We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree. |
format | Online Article Text |
id | pubmed-4044984 |
institution | National Center for Biotechnology Information |
language | English |
publishDate | 2013 |
publisher | BioMed Central |
record_format | MEDLINE/PubMed |
spelling | pubmed-40449842014-06-19 A new approach to enhance the performance of decision tree for classifying gene expression data Hassan, Md Rafiul Kotagiri, Ramamohanarao BMC Proc Proceedings BACKGROUND: Gene expression data classification is a challenging task due to the large dimensionality and very small number of samples. Decision trees are one of the popular machine learning approaches to address such classification problems. However, the existing decision tree algorithms use a single gene feature at each node to split the data into its child nodes and hence might suffer from poor performance, especially when classifying gene expression datasets. RESULTS: By using a new decision tree algorithm in which each node of the tree consists of more than one gene, we enhance the classification performance of traditional decision tree classifiers. Our method selects suitable genes that are combined using a linear function to form a derived composite feature. To determine the structure of the tree we use the area under the Receiver Operating Characteristics curve (AUC). Experimental analysis demonstrates higher classification accuracy using the new decision tree compared to the other existing decision trees in the literature. CONCLUSION: We experimentally compare the effect of our scheme against other well-known decision tree techniques. Experiments show that our algorithm can substantially boost the classification performance of the decision tree. BioMed Central 2013-12-20 /pmc/articles/PMC4044984/ /pubmed/24564916 http://dx.doi.org/10.1186/1753-6561-7-S7-S3 Text en Copyright © 2013 Hassan and Kotagiri; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. |
spellingShingle | Proceedings Hassan, Md Rafiul Kotagiri, Ramamohanarao A new approach to enhance the performance of decision tree for classifying gene expression data |
title | A new approach to enhance the performance of decision tree for classifying gene expression data |
topic | Proceedings |
url | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4044984/ https://www.ncbi.nlm.nih.gov/pubmed/24564916 http://dx.doi.org/10.1186/1753-6561-7-S7-S3 |
work_keys_str_mv | AT hassanmdrafiul anewapproachtoenhancetheperformanceofdecisiontreeforclassifyinggeneexpressiondata AT kotagiriramamohanarao anewapproachtoenhancetheperformanceofdecisiontreeforclassifyinggeneexpressiondata AT hassanmdrafiul newapproachtoenhancetheperformanceofdecisiontreeforclassifyinggeneexpressiondata AT kotagiriramamohanarao newapproachtoenhancetheperformanceofdecisiontreeforclassifyinggeneexpressiondata |
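
The abstract in the description field mentions nodes built from several genes combined by a linear function and scored by AUC, but this record does not say how genes are chosen or how weights are set. The following is only an illustrative sketch under assumptions of our own: the function names (`auc`, `composite_feature`, `greedy_composite`), the greedy forward search, and the restriction of weights to +1 or -1 are hypothetical and are not taken from the article.

```python
from typing import List, Sequence, Tuple


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def composite_feature(X: Sequence[Sequence[float]],
                      genes: Sequence[int],
                      weights: Sequence[float]) -> List[float]:
    """Linear combination of the selected gene columns for every sample."""
    return [sum(w * row[g] for g, w in zip(genes, weights)) for row in X]


def greedy_composite(X: Sequence[Sequence[float]],
                     y: Sequence[int],
                     max_genes: int = 2) -> Tuple[List[int], List[float], float]:
    """Assumed search strategy: repeatedly add the (gene, sign) pair that most
    improves the AUC of the composite feature, stopping when nothing improves."""
    genes: List[int] = []
    weights: List[float] = []
    best = 0.5  # AUC of a feature that carries no class information
    for _ in range(max_genes):
        candidate = None
        for g in range(len(X[0])):
            if g in genes:
                continue
            for w in (1.0, -1.0):
                score = auc(composite_feature(X, genes + [g], weights + [w]), y)
                if score > best:
                    best, candidate = score, (g, w)
        if candidate is None:
            break
        genes.append(candidate[0])
        weights.append(candidate[1])
    return genes, weights, best


# Toy data (invented for illustration): columns are three hypothetical genes.
X = [[3, 1, 0], [1, 3, 0], [2, 2, 5],   # class 1
     [2, 0, 1], [0, 2, 3], [1, 1, 2]]   # class 0
y = [1, 1, 1, 0, 0, 0]
genes, weights, score = greedy_composite(X, y)
```

On this toy data no single column separates the two classes, while the sum of the first two columns does, which is the kind of situation where a multi-gene node could help over a single-gene split.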