Improving prediction accuracy of tumor classification by reusing genes discarded during gene selection

BACKGROUND: Since the high dimensionality of gene expression microarray data sets degrades the generalization performance of classifiers, feature selection, which selects relevant features and discards irrelevant and redundant features, has been widely used in the bioinformatics field. Multi-task le...

Bibliographic Details
Main authors: Yang, Jack Y; Li, Guo-Zheng; Meng, Hao-Hua; Yang, Mary Qu; Deng, Youping
Format: Text
Language: English
Published: BioMed Central 2008
Subjects:
Online access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2386068/
https://www.ncbi.nlm.nih.gov/pubmed/18366616
http://dx.doi.org/10.1186/1471-2164-9-S1-S3
_version_ 1782155204869750784
author Yang, Jack Y
Li, Guo-Zheng
Meng, Hao-Hua
Yang, Mary Qu
Deng, Youping
author_facet Yang, Jack Y
Li, Guo-Zheng
Meng, Hao-Hua
Yang, Mary Qu
Deng, Youping
author_sort Yang, Jack Y
collection PubMed
description BACKGROUND: Since the high dimensionality of gene expression microarray data sets degrades the generalization performance of classifiers, feature selection, which selects relevant features and discards irrelevant and redundant features, has been widely used in the bioinformatics field. Multi-task learning is a novel technique to improve prediction accuracy of tumor classification by using information contained in such discarded redundant features, but which features should be discarded or used as input or output remains an open issue. RESULTS: We demonstrate a framework for automatically selecting features to be input, output, and discarded by using a genetic algorithm, and propose two algorithms: GA-MTL (Genetic algorithm based multi-task learning) and e-GA-MTL (an enhanced version of GA-MTL). Experimental results demonstrate that this framework is effective at selecting features for multi-task learning, and that GA-MTL and e-GA-MTL perform better than other heuristic methods. CONCLUSIONS: Genetic algorithms are a powerful technique to select features for multi-task learning automatically; GA-MTL and e-GA-MTL are shown to improve generalization performance of classifiers on microarray data sets.
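The three-way assignment described in the abstract (each feature becomes an input, an extra output, or is discarded) can be illustrated with a small sketch. This is not the authors' GA-MTL implementation: the integer encoding, the truncation selection, the one-point crossover, and the names `split_roles`, `evolve`, and `toy_fitness` are all assumptions made for illustration. In a real setting the fitness would presumably be the validation accuracy of a multi-task learner trained with the chosen inputs and outputs; here a placeholder score stands in for it.

```python
import random

# Assumed encoding: one role per feature (gene).
DISCARD, INPUT, OUTPUT = 0, 1, 2
ROLES = (DISCARD, INPUT, OUTPUT)


def split_roles(chromosome):
    """Return (input_idx, output_idx, discarded_idx) for a role vector."""
    inputs = [i for i, r in enumerate(chromosome) if r == INPUT]
    outputs = [i for i, r in enumerate(chromosome) if r == OUTPUT]
    discarded = [i for i, r in enumerate(chromosome) if r == DISCARD]
    return inputs, outputs, discarded


def evolve(n_features, fitness, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Tiny genetic algorithm over role vectors (requires n_features >= 2)."""
    rng = random.Random(seed)
    pop = [[rng.choice(ROLES) for _ in range(n_features)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection keeps the best half
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally reassign a feature's role at random.
            child = [rng.choice(ROLES) if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


def toy_fitness(chromosome):
    """Hypothetical placeholder score: agreement with a made-up target assignment."""
    target = [INPUT, INPUT, OUTPUT, DISCARD, OUTPUT, DISCARD]
    return sum(1 for g, t in zip(chromosome, target) if g == t)


best = evolve(6, toy_fitness)
inputs, outputs, discarded = split_roles(best)
print("inputs:", inputs, "extra outputs:", outputs, "discarded:", discarded)
```

Encoding all three roles in a single chromosome lets one search decide jointly which features feed the classifier and which serve as auxiliary targets, which is the open issue the abstract points to.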
format Text
id pubmed-2386068
institution National Center for Biotechnology Information
language English
publishDate 2008
publisher BioMed Central
record_format MEDLINE/PubMed
spelling pubmed-2386068 2008-06-04 Improving prediction accuracy of tumor classification by reusing genes discarded during gene selection Yang, Jack Y Li, Guo-Zheng Meng, Hao-Hua Yang, Mary Qu Deng, Youping BMC Genomics Research BACKGROUND: Since the high dimensionality of gene expression microarray data sets degrades the generalization performance of classifiers, feature selection, which selects relevant features and discards irrelevant and redundant features, has been widely used in the bioinformatics field. Multi-task learning is a novel technique to improve prediction accuracy of tumor classification by using information contained in such discarded redundant features, but which features should be discarded or used as input or output remains an open issue. RESULTS: We demonstrate a framework for automatically selecting features to be input, output, and discarded by using a genetic algorithm, and propose two algorithms: GA-MTL (Genetic algorithm based multi-task learning) and e-GA-MTL (an enhanced version of GA-MTL). Experimental results demonstrate that this framework is effective at selecting features for multi-task learning, and that GA-MTL and e-GA-MTL perform better than other heuristic methods. CONCLUSIONS: Genetic algorithms are a powerful technique to select features for multi-task learning automatically; GA-MTL and e-GA-MTL are shown to improve generalization performance of classifiers on microarray data sets. BioMed Central 2008-03-20 /pmc/articles/PMC2386068/ /pubmed/18366616 http://dx.doi.org/10.1186/1471-2164-9-S1-S3 Text en Copyright © 2008 Yang et al.; licensee BioMed Central Ltd. http://creativecommons.org/licenses/by/2.0 This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
spellingShingle Research
Yang, Jack Y
Li, Guo-Zheng
Meng, Hao-Hua
Yang, Mary Qu
Deng, Youping
Improving prediction accuracy of tumor classification by reusing genes discarded during gene selection
title Improving prediction accuracy of tumor classification by reusing genes discarded during gene selection
topic Research
url https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2386068/
https://www.ncbi.nlm.nih.gov/pubmed/18366616
http://dx.doi.org/10.1186/1471-2164-9-S1-S3