Accurate molecular classification of cancer using simple rules

BACKGROUND: One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among tho...

Bibliographic Details
Main Authors:	Wang, Xiaosheng; Gotoh, Osamu
Format:	Text
Language:	English
Published:	BioMed Central 2009
Subjects:	Research Article
Online Access:	https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2777919/ https://www.ncbi.nlm.nih.gov/pubmed/19874631 http://dx.doi.org/10.1186/1755-8794-2-64

collection	PubMed
description	BACKGROUND: One intractable problem with using microarray data analysis for cancer classification is how to reduce the extremely high-dimensionality gene feature data to remove the effects of noise. Feature selection is often used to address this problem by selecting informative genes from among thousands or tens of thousands of genes. However, most of the existing methods of microarray-based cancer classification utilize too many genes to achieve accurate classification, which often hampers the interpretability of the models. For a better understanding of the classification results, it is desirable to develop simpler rule-based models with as few marker genes as possible. METHODS: We screened a small number of informative single genes and gene pairs on the basis of their depended degrees proposed in rough sets. Applying the decision rules induced by the selected genes or gene pairs, we constructed cancer classifiers. We tested the efficacy of the classifiers by leave-one-out cross-validation (LOOCV) of training sets and classification of independent test sets. RESULTS: We applied our methods to five cancerous gene expression datasets: leukemia (acute lymphoblastic leukemia [ALL] vs. acute myeloid leukemia [AML]), lung cancer, prostate cancer, breast cancer, and leukemia (ALL vs. mixed-lineage leukemia [MLL] vs. AML). Accurate classification outcomes were obtained by utilizing just one or two genes. Some genes that correlated closely with the pathogenesis of relevant cancers were identified. In terms of both classification performance and algorithm simplicity, our approach outperformed or at least matched existing methods. CONCLUSION: In cancerous gene expression datasets, a small number of genes, even one or two if selected correctly, is capable of achieving an ideal cancer classification effect. This finding also means that very simple rules may perform well for cancerous class prediction.
id	pubmed-2777919
institution	National Center for Biotechnology Information
record_format	MEDLINE/PubMed
spelling	pubmed-2777919 2009-11-17 Accurate molecular classification of cancer using simple rules Wang, Xiaosheng Gotoh, Osamu BMC Med Genomics Research Article BioMed Central 2009-10-30 /pmc/articles/PMC2777919/ /pubmed/19874631 http://dx.doi.org/10.1186/1755-8794-2-64 Text en Copyright © 2009 Wang and Gotoh; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
title	Accurate molecular classification of cancer using simple rules
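
The METHODS summary in the description field mentions ranking genes by their depended degree from rough set theory, inducing decision rules from the selected genes, and checking them by LOOCV. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes the standard rough-set dependency degree (the fraction of samples whose attribute value maps to exactly one class), assumes expression values have already been discretized, and uses made-up toy data. The function names are hypothetical.

```python
from collections import defaultdict


def dependency_degree(values, labels):
    """Fraction of samples whose attribute value occurs with only one class.

    This is the standard rough-set dependency degree |POS| / |U| for a
    single discretized attribute (an assumption; the paper may use a variant).
    """
    classes_by_value = defaultdict(set)
    for v, y in zip(values, labels):
        classes_by_value[v].add(y)
    consistent = sum(1 for v in values if len(classes_by_value[v]) == 1)
    return consistent / len(values)


def induce_rules(values, labels):
    """Build 'if gene == value then class' rules using the majority class."""
    counts = defaultdict(lambda: defaultdict(int))
    for v, y in zip(values, labels):
        counts[v][y] += 1
    return {v: max(c, key=c.get) for v, c in counts.items()}


def loocv_accuracy(values, labels):
    """Leave-one-out cross-validation of the single-gene rule classifier."""
    correct = 0
    for i in range(len(values)):
        # Induce rules from every sample except the held-out one.
        rules = induce_rules(values[:i] + values[i + 1:],
                             labels[:i] + labels[i + 1:])
        # A value unseen in training yields None and counts as an error.
        if rules.get(values[i]) == labels[i]:
            correct += 1
    return correct / len(values)


# Toy, made-up discretized expression levels for two hypothetical genes.
gene_a = ["high", "high", "high", "low", "low", "low"]
gene_b = ["high", "low", "high", "low", "high", "low"]
labels = ["ALL", "ALL", "ALL", "AML", "AML", "AML"]

print(dependency_degree(gene_a, labels))  # gene_a separates the classes fully
print(dependency_degree(gene_b, labels))  # gene_b carries no class information
print(loocv_accuracy(gene_a, labels))
```

In this toy setting a gene with a dependency degree of 1 yields a one-gene rule set that classifies every held-out sample correctly, which illustrates why a single well-chosen gene can suffice on some datasets.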
